REMARKS



Claims 27-36 are pending in the present application. 

The Specification is amended to correctly specify the priority data of the instant 
application under the section "Cross-Reference to Related Applications." No new matter is added 
and entry of the amendments to the Specification and claims is respectfully requested. 

Reconsideration of the application is respectfully requested in view of the above 
amendments and the following remarks. For the Examiner's convenience, Applicant's remarks 
are presented in the order in which they were raised in the Office Action. 

A. Non-statutory Double Patenting Rejection 

(a) Claims 27-30 stand rejected under the judicially created doctrine of obviousness-type 
double patenting as being unpatentable over claims 1-4 of U.S. Patent No. 5,585,258. 

Claims 31-35 are rejected under the judicially created doctrine of obviousness-type 
double patenting as being unpatentable over claims 5-9 of U.S. Patent No. 5,585,258 in view of 
Benson et al., U.S. Patent No. 5,258,496. Benson is cited for the teaching of recombinant fusion
polypeptides being comprised in compositions during purification from the host cell.

Claim 36 is rejected under the judicially created doctrine of obviousness-type double 
patenting as being unpatentable over claims 1 and 3-5 of U.S. Patent No. 5,597,691. 

Claims 27 and 30 are rejected under the judicially created doctrine of obviousness-type 
double patenting as being unpatentable over claims 1 and 2 of U.S. Patent No. 5,712,145. 

Claims 31, 32 and 35 are rejected under the judicially created doctrine of obviousness- 
type double patenting as being unpatentable over claims 3-5 of U.S. Patent No. 5,712,145 in 
view of Benson et al., U.S. Patent No. 5,258,496. 

Claim 36 is rejected under the judicially created doctrine of obviousness-type double 
patenting as being unpatentable over claims 7 and 8 of U.S. Patent No. 5,712,145. 

Applicants submit that they will file a terminal disclaimer in the present application to 
disclaim any term beyond the term of the earlier expiring patents in order to overcome this 
ground for rejection, after the conflicting claims are found to be allowable.

(b) Claims 27 and 30 are provisionally rejected under the judicially created doctrine of
obviousness-type double patenting as being unpatentable over claim 11 of copending
Application No. 10/409,094, which is an application for reissue of U.S. Patent No. 5,585,258. 

Applicants submit that they will file a terminal disclaimer in the appropriate case - the
present application or copending Application No. 10/409,094 - to disclaim any term beyond the 
term of the earlier expiring patent in order to overcome this ground for rejection, after the 
conflicting claims are found allowable. 

Claim 36 is provisionally rejected under the judicially created doctrine of obviousness- 
type double patenting as being unpatentable over claim 6 of copending Application No. 
10/409,673, which is an application for reissue of U.S. Patent No. 5,597,691. 

Applicants submit that they will file a terminal disclaimer in the appropriate case - the
present application or copending Application No. 10/409,673 - to disclaim any term beyond the 
term of the earlier expiring patent in order to overcome this ground for rejection, after the 
conflicting claims are found allowable. 

B. Rejections under 35 USC § 112 

1. Rejections under 35 U.S.C. §112, first paragraph - written description 

Claims 27, 31, and 36 (and dependent claims thereof) stand rejected under 35 U.S.C. §
112, first paragraph, for lack of written description, as containing subject matter which was not
described in the Specification in such a way as to reasonably convey to one skilled in the 
relevant art that the inventors, at the time the application was filed, had possession of the claimed 
invention. 

Specifically, the Examiner contends that the Specification fails to exemplify or describe 
the preparation of, or recite any structural features of, the "proteolytic hepatitis C virus 
polypeptides" recited by the claims. (Office Action at p. 5). The Examiner states that the 
Specification fails to identify an amino acid sequence that constitutes a proteolytic HCV 
polypeptide or describe its purification. Id. The Examiner also states that the application fails to 
disclose a domain protease with proteolytic activity and contends that Example 5 of the 
Specification suggests that the proteolytic products detected by ELISA could only have been 
produced by endogenous proteases. The Examiner further states that no "purified proteolytic
HCV peptide" of claim 36 is shown to cleave any peptide substrate and no identifying
characteristics of a generic NS3 domain protease are shown.

Applicants respectfully traverse for the following reasons: 

(i) A Written Description of a "proteolytic hepatitis C virus polypeptide" that comprises a 
hepatitis C virus NS3 domain protease or an active NS3 domain hepatitis C virus protease 
truncation analog is provided.

"[T]he 'essential goal' of the description of the invention requirement is to clearly convey 
the information that an applicant has invented the subject matter which is claimed." In re Barker, 
559 F.2d 588, 592 n.4, 194 USPQ 470, 473 n.4 (CCPA 1977). The test for sufficiency of support 
in a parent application is whether the disclosure of the application relied upon "reasonably 
conveys to the artisan that the inventor had possession at that time of the later claimed subject 
matter." Ralston Purina Co. v. Far-Mar-Co., Inc., 772 F.2d 1570, 1575, 227 USPQ 177, 179 
(Fed. Cir. 1985) (quoting In re Kaslow, 707 F.2d 1366, 1375, 217 USPQ 1089, 1096 (Fed. Cir.
1983)). 

(a) An NS3 domain is described in the Specification 

The Specification states that: "[t]he term 'HCV protease' refers to an enzyme derived 
from HCV which exhibits proteolytic activity, specifically the polypeptide encoded in the NS3 
domain of the HCV genome." (Specification, page 6, lines 22-25). An HCV NS3 domain
protease sequence is provided in Figure 1 of the Specification. (Specification, page 3, line 7). 
The Specification points to a specific section in the NS3 domain as the key to proteolytic activity 
and notes that the termini of the relevant section are putative. (Specification, page 6, line 24 
through page 7, line 21). The Specification describes an NS3 domain of HCV. Page 8, lines 7-25
refer to the NS3 domain by analogy with the Yellow Fever Virus (a flavivirus) polyprotein. An HCV
protease encoded by the NS3 domain in at least one strain of HCV is further described with
reference to a 202 amino acid protease sequence from SEQ ID NO: 1 in page 6, line 22 to page
7, line 18 (see SEQ ID NO: 65).

An "active" truncation analog is one that exhibits proteolytic activity, a property that one 
can ascertain by running a limited number of standard experiments. The Specification describes 
how one would determine the structure of the shortest active HCV NS3 protease by truncation 
analysis. (Specification, page 7, line 27 - page 8, line 6).

In a European patent application EP 318,216A1 (published May 31, 1989), the inventors
of the present application had previously reported¹ the nucleotide sequence of the HCV genome
and identified a similarity between a 530 amino acid domain of the HCV polyprotein sequence 
and the NS3 protein sequence of dengue virus, a flavivirus. (p. 52, sec. IV.H.3 and Figs. 41-1 
and 41-2). Likewise, in PCT application WO 89/04669 a correlation between HCV polyprotein
and a nonstructural protein of the flavivirus was noted (p. 128, sec. IV.H.3). The disclosures of
both WO 89/04669 and EP 318,216 (Houghton et al.) are incorporated by reference in the instant 
Specification at page 4, lines 4-8. 

NS3 domains in flaviviruses such as yellow fever virus were known in the art. (see Fig. 1 
and page 731 in Rice CM et al, Science, 229(4715):726-733 (1985)). EP 388,232 by the same 
inventive entity as the current application and published September 19, 1990, identified the NS3 
domain in comparison with flaviviruses. (pages 33-34 of EP 388,232). Other publications 
identifying an NS3 domain protease of HCV were available prior to the filing of the priority 
application of the current application. Computer-aided comparative analysis had shown that the
polyproteins of several flaviviruses share sequence similarity with HCV in the NS3 region.
(Miller et al., Proc. Natl. Acad. Sci. 87:2057-2061, at 2060 and Fig. 3 (March 1990)). Yoneyama
et al. disclose the use of PCR primers from the NS3 region of HCV for detection of viral
sequences. (Jpn. J. Med. Sci. Biol. 43:89-94 (1990)).²

(b) An NS3 domain protease is disclosed in the Specification. 

Independent claims 27 and 36 specify a composition comprising a purified hepatitis C 
virus polypeptide which itself comprises "an HCV NS3 domain protease or an active HCV NS3 
domain protease truncation analog." Independent claim 31 specifies a composition comprising a 
purified hepatitis C virus polypeptide which itself comprises a fusion protein containing "a HCV 
NS3 domain protease or ... an active HCV NS3 domain protease truncation analog." The
remaining dependent claims are generally limited to truncation analogs containing the amino
acid sequence of SEQ ID NOS: 63-65. The claims are not directed to a specific kind of protease
activity; they are directed to any protease activity encoded by the NS3 region.



1 EP 318,216A1 was filed on Nov. 11, 1988 and published May 31, 1989, prior to the earliest priority date of this 
application. 

2 Courtesy copies of the references mentioned in the response are attached as Exhibit H for the Examiner's 
convenience. Only the relevant pages of EP 388,232, WO 89/04669 and EP 318,216 are enclosed.

A protease activity associated with the NS3 domain is characterized in Example 5
(Specification, page 31, lines 12-17), which shows self-cleavage of SOD-HCV protease fusion
proteins expressed in E. coli.³ Example 4 (Specification, page 29, line 4 through page 30, line 6)
describes the amino acids of HCV protease encoded by each fusion protein. 

• The P190 fusion product encoding amino acids 1-199 of the HCV protease (page 29,
lines 19-20) showed no protease cleavage activity (Specification, page 32, lines 8-12). 

• P300 which includes amino acids 1-299 of HCV protease (page 29, lines 25-26) 
indicated occurrence of cleavage (Specification, page 32, lines 1-7). 

• P500 comprising amino acids 1-513 of Fig. 1 (page 30, lines 4-6) indicated 
occurrence of cleavage (Specification, page 31, lines 22-25). 

• The fusion protein ("P600") encoded by the vector cflSODp600 which includes
amino acids 1-686 of Fig. 1 also showed proteolytic activity. (Specification, page 31, 
lines 12-17). 

• The Specification concludes that "the minimum essential sequence for HCV protease 
extends to the region between amino acids 199 and 299." (Specification, page 32, 
lines 10-12). 

The Examiner incorrectly assumes that the proteolytic cleavage described in Example 5 is 
attributable solely to the host cells' endogenous proteases. Only in subsection A of Example 5,
which describes the protease activity of the P600 fusion protein resulting in "34, 53 and 66 kDa"
bands, are the 53 and 66 kDa bands surmised to have undergone "varying degrees of (possibly
bacterial) processing" as the predicted product of theoretical Mr = 93 kDa was not observed.
(Specification, page 31, lines 13-17). 

Protease activity attributable to the NS3 region was evident in the P300 and P500 fusion
proteins and no "possibly bacterial" processing is suggested in Examples 5(B) and (C) as the
predicted proteolysis products of theoretical Mr = 51 and 73 kDa respectively were observed.
(Specification, page 31, lines 22-25, and page 32, lines 1-2). That the protease activity resides in
the HCV NS3 region is confirmed by the observation in Example 5(C), where the P190 fusion
product encoding amino acids 1-199 of the HCV protease did not show any protease cleavage
activity (Specification, page 32, lines 8-12).

3 "The results indicated the occurrence of cleavage, as no full length product (theoretical Mr = 93 kDa) was evident
on the gel." (Specification, page 31, lines 12-13).

(c) A peptide substrate for the NS3 domain protease is provided in the Specification 

The Examiner further contends that the application fails to disclose a proteolytic HCV
peptide, encoded by a polynucleotide or expression vector, that could cleave any peptide
substrate. Applicants respectfully traverse and submit that a peptide substrate for an HCV NS3
associated protease is disclosed in the Specification. The protease activity described in
Examples 5(A), (B), and (C) was observed through self-cleavage of an hSOD-HCV fusion
protein wherein the HCV peptide portion corresponded to amino acids 1-686 of Fig. 1 and
various truncations thereof. Observance of specific cleavage within the NS3 region is described
in every instance where protease activity was observed. For example, a "34 kDa band
correspond[ing] to the hSOD partner (about 20 kDa) with a portion of the NS3 domain" was
observed in each case with the P600, P300 and P500 fusion proteins of NS3 fused to an hSOD
leader.

Applicants submit that the Specification describes a protease activity specifically 
associated with the NS3 region and provides disclosure of a substrate for such protease activity. 
Thus one of skill in the art would have identified the NS3 domain described in the Specification 
and understood that at the time of filing of the application, the inventors had possession of the 
claimed invention. Therefore, Applicants respectfully request withdrawal of this ground for
rejection for lack of written description under 35 U.S.C. § 112, first paragraph.

(d) NS4A is not essential for the activity of an NS3 domain hepatitis C virus protease or 
truncation analog 

The Examiner refers to several references submitted in an IDS by the Applicants to 
contend that NS3 domain hepatitis C virus protease requires another region termed NS4A. 
Applicants respectfully traverse. 

The claims of the current application generally specify a composition comprising a 
purified hepatitis C virus polypeptide which itself comprises "an HCV NS3 domain protease or 
an active HCV NS3 domain protease truncation analog." The claims are not directed to a 
specific kind of protease activity but any protease activity encoded by the NS3 region. The
claims do not specify a particular kind of protease. The NS4A cofactor referred to by the Examiner
relates to a "serine protease activity" encoded by the NS3 region. The
Specification clearly demonstrates a protease activity associated with a protein comprising amino
acids 1-299 of HCV protease (see Example 5(C)). While a serine protease activity also encoded
within this region may optionally require an NS4A cofactor, Applicants' claims are directed to any
protease activity within the NS3 region. Further, Applicants note that "while NS4A appears to be
absolutely required for trans-cleavage at the 4B/5A site, it is not an essential cofactor for serine
protease activity." (see Abstract, lines 10-11, page 8151 right column (first full paragraph) of Lin
et al., J. Virol. 68(12): 8147-8157 (1994)). Further, cis-cleavage by NS3 domain proteases does not
require NS4A. (Lin et al., p. 8149, right col.; p. 8152, right col.; Fig. 7A; p. 8155, left col.).

Applicants submit that the Specification describes a protease activity specifically 
associated with the NS3 region and provides disclosure of a substrate for such protease activity. 
Thus one of skill in the art would have identified the NS3 domain described in the Specification 
and understood that at the time of filing of the application, the inventors had possession of the 
claimed invention. Therefore, applicants respectfully request withdrawal of this ground for 
rejection for lack of written description under 35 U.S.C. § 112, first paragraph.

2. Rejections under 35 U.S.C. §112, first paragraph - enablement

Claims 27-36 are rejected under 35 U.S.C. § 112, first paragraph, because the
Specification, while being enabling for recombinant expression of a catalytic component of a 
hepatitis C virus protease comprising the amino acid sequence set forth in SEQ ID NO:66, does 
not reasonably provide enablement for preparation of compositions, or expression vectors, that 
comprise polynucleotides encoding a proteolytically active hepatitis C virus protease, whether or 
not fused to a fusion partner. The Specification does not enable any person skilled in the art to 
which it pertains, or with which it is most nearly connected, to make and use the invention 
commensurate in scope with these claims. 

Specifically, the Examiner contends that while appropriate analogies are made in the 
Specification between SEQ ID NO: 66, serine protease characteristics, and analogous regions in 
other flaviviruses, no guidance is provided for making such a protease. The Examiner states that 
the Specification "does not describe, thus cannot enable, an integral hepatitis C virus protease 
capable of cleaving a defined substrate." (Office Action, page 8). The Examiner also states that
the small peptides specified in claims 28, 29, 33 and 34 (i.e., of SEQ ID NOS: 63 and 64) "are
insufficient to support proteolysis even if Applicants' disclosure had provided [such] guidance."
Id.

Applicants respectfully traverse these grounds for rejection. 

(i) The Specification enables an NS3 domain protease of the amino acid sequences of SEQ
ID NOS: 63, 64, 65 and 66

To be enabling, the Specification must teach those skilled in the art how to make and use 
the full scope of the claimed invention without undue experimentation. (Genentech Inc. v.
Novo Nordisk A/S, 108 F.3d 1361, 42 USPQ2d 1001 (Fed. Cir. 1997)).

(a) Applicants submit that the Wands factor cited by the Examiner, i.e., specific guidance 
about the portion of the HCV polyprotein responsible for recognition of native cleavage sites in 
the polyprotein, is satisfied in the Specification. Specifically, the truncation analysis described in
Example 5 shows that a minimal region between amino acids 199 and 299 of Fig. 1 has
protease activity, with specific cleavage occurring within the HCV NS3 portion of the
hSOD-HCV fusion protein.

As submitted above, the Specification identifies an HCV NS3 region. As described in
Example 5, a protease activity is shown to be associated with amino acids 1-299 of Fig. 1 (P300
fusion protein; SEQ ID NO: 66). Example 5 also shows that no protease activity is observed
within amino acids 1-199 of Fig. 1 (P190 fusion protein; SEQ ID NO: 67). SEQ ID NO: 65
extends from amino acid 60-262 of Fig. 1. Thus the amino acid sequence essential for the
protease activity is located between amino acids 200 and 262 of the given sequence. One of skill
in the art therefore has only a definite and specific region of the amino acid sequence to examine
in order to identify the protease activity and is able to do so without undue experimentation.

SEQ ID NOS: 63 and 64 specify 11 and 9 amino acid sequences within SEQ ID NO: 65
and are specified in dependent claims 28 and 29 which depend from independent claim 27, and
claims 33 and 34 which depend from independent claim 31. Independent claims 27 and 31 both
specify a "proteolytic hepatitis C virus (HCV) polypeptide wherein said HCV polypeptide
comprises .... [a] protease." (emphasis added). The proteolytic HCV polypeptides according to
claims 28, 29, 33 and 34 need only have "a partial internal amino acid sequence comprising" 
SEQ ID NOS: 63 and 64.

SEQ ID NOS: 63 and 64 span 11 and 9 amino acid sequences within SEQ ID NO: 65 and
are within the protease domain of amino acids 200 and 262 of the given sequence of Fig. 1.
Further, SEQ ID NOS: 63 and 64 span a histidine and a serine containing region respectively of
sequences homologous to regions responsible for serine protease catalytic activity in Yellow
Fever Virus, West Nile Fever virus, Murray Valley Fever virus, and Kunjin virus (Table 1) and
in the well-characterized serine proteases: protease A from Streptomyces griseus, α-lytic
protease, bovine trypsin, chymotrypsin, and elastase (Table 2). (see page 8, line 7 - page 9, line
17). Thus, by structural homology and alignment, SEQ ID NOS: 63 and 64 are disclosed in the 
Specification to be associated with protease activity. 

While the Specification notes characteristic similarities with a serine protease, the claims
are directed more broadly to a "protease" activity within the NS3 domain. Applicants are not
required to correctly set forth, or even know, how and why the claimed NS3 region demonstrates
protease activity. See Enzo Biochem v. Calgene, Inc., 188 F.3d 1362, 1375 (Fed. Cir. 1999) ("it
is not a requirement of patentability that an inventor correctly set forth, or even know, how or 
why the invention works"). 

(b) The Examiner also alleges a lack of working examples of an assay that could measure 
inhibition of an HCV protease. Applicants respectfully traverse. 

As discussed above, the Specification discloses an NS3 domain with protease activity 
residing in a region defined by truncation analysis. Examples 4 and 5 disclose several
hSOD-protease fusion proteins (cflSODp600, P300, P500) that act specifically as substrates of the NS3
domain protease. Given a protease and a substrate, one of skill in the art would be able to assay
for inhibitors of the protease activity. Inhibitors such as organic compounds, peptide inhibitors 
and antibodies and methods for designing them are disclosed on pages 17-18 of the 
Specification. General methods for screening protease inhibitors are listed at pages 18-19 of the 
Specification. 

(c) The Examiner also states that "in view of the publications made of record herein," the 
existing state of the art at the time of filing does not support the identification of other, distant 
regions of flavivirus sequence that confer cleavage specificity. Applicants respectfully traverse. 
As discussed above in Section 1(i)(d), the Specification describes a protease activity specifically
associated with the NS3 region. The claims are not directed to a specific kind of protease activity
but any protease activity encoded by the NS3 region. Applicants submit that the Specification
describes a protease activity specifically associated with the NS3 region and provides disclosure
of a substrate for such protease activity (see discussion under Section 1(i)(d) above).

Application No.: 09/884,455
Docket No.: 223002010004

Applicants respectfully request that in view of the description of an NS3 domain, a 
protease activity associated with the NS3 domain, and identification of a substrate specifically 
cleaved by the NS3 protease in the Specification, the rejection for lack of enablement under 35 
U.S.C. § 112, first paragraph, be withdrawn.

3. Rejections under 35 U.S.C. §112, second paragraph - indefiniteness

a. Indefiniteness of the terms "domain protease" and "truncation
analog"

Claims 27-36 stand rejected under 35 U.S.C. § 112, second paragraph, as being indefinite
for failing to particularly point out and distinctly claim the subject matter which applicant
regards as the invention. Specifically, the Examiner concludes that independent claims 27 and
31 are indefinite in reciting, "proteolytic hepatitis C virus polypeptide . . . compris[ing] an HCV
NS3 domain protease or an active . . . truncation analog," because the Specification does not 
provide a specific, limiting, structural description of a generic NS3 domain protease, and thus 
one could not determine what is more than the protease and what is a truncation analog. 

Applicants respectfully traverse. The application, at page 5, line 20 through page 6, line
4, refers to the NS3 domain by analogy with the Yellow Fever Virus polyprotein. An HCV protease
encoded by or within the NS3 domain is further described with reference to a 202 amino acid
protease within SEQ ID NO: 1 in at least one strain of HCV in page 6, line 26 through page 7,
line 18. SEQ ID NO: 65 consists of the corresponding 202 amino acid sequence from Figure 1 
(amino acids 60-262). 

The protease activity associated with HCV NS3 domain is further characterized in 
Example 5 (Specification, pages 31-32) as discussed in detail above. The Specification identifies 
by truncation analysis described in Example 5 a "minimum essential sequence for HCV protease 
[that] extends to the region between amino acids 199 and 299." (Specification, page 32, lines 10- 
12). 

Further, Examples 4 and 5, as discussed above, disclose active truncation analogs of the 
HCV NS3 domain protease. The Specification on pages 29-32 discloses a fusion protein P600
including amino acids 1-686 of Fig. 1 which demonstrates protease activity. Active truncation
analogs P500 comprising amino acids 1-513 of Fig. 1, and P300 comprising amino acids 1-299
also demonstrate protease activity associated with the NS3 domain. (Specification, page 31, line
5 - page 32, line 7).⁴

Applicants submit that the terms "HCV NS3 domain protease" and "active ... truncation 
analog" are clearly defined in the Specification and request withdrawal of this ground for 
rejection. 

b. Indefiniteness of the term "purified" 

The Examiner also states that claims 27 and 31 (and dependent claims) are indefinite 
because they recite a composition comprising a purified polypeptide, where no polypeptide can 
remain "purified" when present in a composition. Claim 36 is likewise indefinite because it 
recites a method involving the step of providing a purified polypeptide. According to the 
Examiner, the claims also do not provide for purification of proteolytic HCV polypeptides. 

Applicants respectfully traverse the Examiner's interpretation of a composition 
comprising a "purified" polypeptide. Claims 27 and 31 specify compositions comprising the 
"purified" HCV polypeptides as a component of the compositions. Absolute purity of HCV 
polypeptides within the compositions is not claimed.

The ordinary meaning of "purify" is to "free from undesirable elements." (Merriam-Webster's
Collegiate Dictionary, 10th ed. 2002, Merriam-Webster, Inc., Springfield, MA; Exhibit
E). Therefore, a "purified" substance can exist in a composition so long as it is free from
undesirable elements. Accordingly, a "purified" proteolytic HCV polypeptide can exist in a 
composition. 

Further, the Specification uses the term "purified" in a manner consistent with this 
meaning. For instance, it discloses that a calcium-dependent monoclonal antibody, which binds
to the FLAG encoded peptide, can be used to purify the fusion protein without harsh eluting 
conditions. (Page 31, lines 2-3). The "purified" protein is obtained as an antibody-protein 
complex, a composition comprising a purified protein. Similarly, Example 6 of the Specification 
describes a method of purifying protease from E. coli by SDS-PAGE. The final purified product
is eluted from a gel. (Page 33, lines 20-23). In this instance, the "purified" protease is part of a
composition comprising the eluate.

4 Truncation analog P190 comprising amino acids 1-199 does not demonstrate proteolytic activity, and is thus an
"inactive" truncation analog. (Specification, page 32, lines 8-10).

Therefore, Applicants respectfully request that the rejection for indefiniteness under 35
U.S.C. § 112, second paragraph, be withdrawn.

C. Rejections under 35 USC § 103 

Claims 27-35 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over 
Miyamura et al., U.S. 5,372,928, in view of Miller et al., 1990, Proceedings of the National 
Academy of Sciences, U.S.A, Vol. 87, pages 2057-2061; Bazan et al., 1989, Virology, Vol. 171, 
pages 637-639 and Gorbalenya et al., 1989, Nucleic Acids Research, Vol. 17, pages 3889-3897. 

Applicants respectfully traverse this rejection because it does not establish prima facie 
obviousness of the claimed inventions. In particular, the key teachings in the Miyamura patent 
relied upon in the rejection are not available as prior art against the claimed inventions as a 
matter of law. Since the Miyamura patent is the primary reference, the rejection should be 
withdrawn. 

The Office Action characterizes the Miyamura patent as follows: 

(1) Miyamura et al., see Figures 12A-C, teach a polynucleotide encoding the HCV1 
strain polyprotein and the relative positions of both the structural and the non-structural domains 
within the polyprotein encoded by the nucleic acid sequence of the HCV1 strain, see Miyamura 
patent, columns 6-7 and Figure 11. Office Action at 10.

(2) Miyamura et al. teach, at lines 8-10 of col. 7, that the "putative NS3 [domain 
extends] from about amino acid 1007 to about amino acid 1650." Id. (text within brackets in
original). 

(3) Miyamura et al. teach, at col. 17, lines 5-21, that functions of domains within the 
HCV polyprotein may be predicted on the basis of similarities shared by the HCV polyprotein 
amino acid sequence and flavivirus polyprotein amino acid sequences and that flavivirus NS3 
domains have an amino acid sequence region that provides a protease function. Office Action at 
10-11. According to the Examiner, these teachings have priority to September 15, 1989, when
they appeared in the parent application serial No. 07/408,045.

(4) At Examples I-IV at cols. 28-39, Miyamura et al. "teach preparation of cloning
vectors, and transformed host cells comprising the vectors, comprising inserts of specific,
defined, regions found anywhere in a nucleic acid sequence encoding all or part of an hepatitis C
virus polyprotein." Office Action at 11. Miyamura et al. "explicitly teach, at cols. 8-10, that
expression vectors comprising transcriptional and translational regulatory elements operably
linked to a polynucleotide encoding a desired regions [sic] of the HCV polyprotein should be
used to produce desired portions of the hepatitis C virus polyprotein in host cells." Id.
Miyamura et al. "further suggest preparation of expression constructs providing fusions of
hepatitis C virus amino acid sequence regions with proteins commonly used in the art as fusion
partners such as β-galactosidase and superoxide dismutase [SOD]" at cols. 14-15. Office Action
at 11 (bracketed text in original).

1. The Instant Application Claims Priority to U.S. Patent No. 5,371,017

The instant application claims priority to Application No. 07/680,296, filed on April 4, 1991
(now U.S. Patent No. 5,371,017), and its Specification is identical to that of the '017 patent.⁵ The
inventors of the instant application and the '017 patent are identical.

As discussed below, Miyamura et al. is not available as § 102(e) prior art against the '017
patent. Since the instant application claims priority to (and contains identical disclosure to) the
application from which the '017 patent issued, Miyamura is not available as prior art against the
instant application.

2. The Miyamura Patent Cannot be Relied Upon as § 102(e) Prior Art 
With Respect to the HCV-1 ORF Sequence Information Shown in 
Figure 12, the Putative Genomic Organization Shown in Figure 11, 
and the Subject Matter of Column 7, Lines 8-9 and Column 17, Lines 
17-21 of the Miyamura Patent 

Subject matter disclosed in the Miyamura patent qualifies as prior art under 35 U.S.C. 
§ 103 only if it meets the requirements of 35 U.S.C. § 102(e). The Office Action relies in part on 
Figure 12, Figure 11 and "cols. 6-7" of the Miyamura patent, stating that the patent teaches a 

5 The instant application, 09/884,455, is a continuation of Application No. 09/253,675, which is a continuation of 
Application No. 08/709,177 (U.S. Patent No. 5,885,799), which is a continuation of Application No. 08/440,548 
(U.S. Patent No. 5,597,691), which is a divisional of Application No. 08/350,884 (U.S. Patent No. 5,585,258), 
which is a divisional of Application No. 07/680,296, filed April 4, 1991, (U.S. Patent No. 5,371,017). 

polynucleotide encoding the HCV1 strain's polyprotein and "the relative positions of the 
structural and the non-structural domains within the polyprotein." Office Action at 10. The 
Office Action also relies in part on the Miyamura patent at column 7, lines 8-10, asserting that it 
teaches there that the "putative NS3 [domain extends] from about amino acid 1007 to about 
amino acid 1650" (bracketed text in original). Id. The Office Action also relies in part on the 
Miyamura patent at column 17, lines 5-21, stating that the patent there teaches "that functions of 
domains within the hepatitis C virus polyprotein may be predicted on the basis of similarities 
shared by amino acid sequence of flaviviruses and the hepatitis C virus amino acid sequence and 
that a protease function resides in the amino acid sequences of flavivirus NS3 domains." Id. at 
10-11. 

Applicants note that the Office Action points out that the teachings of the Miyamura 
patent have priority to September 15, 1989, when they appeared in the parent application 
07/408,045. Office Action at 11. As noted by the Examiner, the Miyamura patent makes clear 
that the Figure 12 sequence, as well as the genomic organization information set out in Figure 11 
and at column 7, lines 8-9, concerns HCV-1. See Miyamura patent, column 4, lines 33-36; 
column 6, line 65 to column 7, line 16. Similarly, the context surrounding the cited material at 
column 17, lines 16-21 of the Miyamura patent makes clear that this prediction is premised on 
HCV-1 sequence data. As shown below, this HCV-1 subject matter was derived by the 
Miyamura inventive entity from the inventive entity of the '017 patent, which is the same 
inventive entity as the instant application. 

Applicants note that the Miyamura patent does not claim any HCV-1 sequences or 
methods, but rather specifically disclaims HCV-1. See Miyamura patent, column 40, line 47 to 
column 42, line 30, claims 1-6 (all of which contain the limitation "wherein said sequence is not 
homologous to the nucleotide sequence of HCV isolate HCV1"). Similarly, the Miyamura 
Specification makes clear that Miyamura's invention relates to J1 and J7 HCV isolates, and not 
to HCV-1. See, e.g., Miyamura patent, column 1, lines 18-19, and column 2, line 34 to column 
3, line 65. 

a. Miyamura's Derivation of HCV-1 Subject Matter From The 
Inventive Entity Of The Instant Application 

As a further evidentiary submission, Applicants provide the Declaration of Tatsuo 
Miyamura Under 37 C.F.R. § 1.132 ("Miyamura Decl.", Exhibit A hereto, originally submitted 
during the Reexamination of the priority patent 5,371,017), who is the first-named inventor on 
the Miyamura patent. Dr. Miyamura states that the Miyamura patent arose from a collaboration 
between himself and his colleague Dr. Izumi Saito, with Dr. Houghton and his colleagues at 
Chiron. Miyamura Decl. ¶ 5. Dr. Miyamura declares that Dr. Houghton provided him with the 
HCV-1 ORF sequence shown in Miyamura Figure 12 and the information regarding the HCV-1 
putative genomic organization shown in Miyamura Figure 11. 6 Id. at ¶ 6. Dr. Miyamura further 
states that neither he nor his colleague Dr. Saito independently determined this information prior 
to the filing of the applications for the Miyamura patent. Id. Dr. Miyamura further declares that 
the sentence at Miyamura column 17, lines 17-21 7 "reflects work done by Dr. Houghton and his 
colleagues, not by Dr. Saito and myself. I believe that sentence was the contribution of Dr. 
Houghton." Id. at ¶ 7. 

Applicants also point to the Declaration for Continuation-in-Part Application submitted 
to the PTO by Drs. Houghton, Choo and Kuo when the application for the '017 patent was filed, 
a copy of which is attached as Exhibit B hereto. In that declaration Drs. Houghton, Choo and 
Kuo declare that they are the "original, first and joint" inventors of the subject matter which is 
claimed and for which a patent is sought. Exhibit B hereto. 



6 Applicants note that the sequence information in Figure 12 of the Miyamura patents and the genomic organization 
information in Figure 11 of the Miyamura patents together provide the subject matter at Miyamura '928 patent, 
column 7, lines 8-9 (i.e., prediction of a putative NS3 domain from about amino acid 1007 to about amino acid 
1650). First, the differing numbering schemes of Figures 11 and 12 must be normalized to one another. In 
Figure 12, the first nucleotide of the first translated codon (part of the "putative initiator methionine") is numbered 
as nucleotide 320. In contrast, the first nucleotide of the first translated codon in Figure 11 corresponds to nucleotide 
1. This may be deduced, for example, because: (a) the protein encoded by the putative C domain is described as 
having approximately 115 amino acids (Miyamura '928 patent, column 6, line 67 to column 7, line 2); (b) the 3' 
boundary of the C domain in Figure 11 is designated as nucleotide 345; and (c) since a codon consists of three 
nucleotides, the first nucleotide of Figure 11 must represent the first nucleotide of the first translated codon (i.e., 3 x 
115 = 345). A nucleotide in Figure 11 can thus be correlated to a nucleotide in Figure 12 by adding 319. 

The Figure 11 putative boundary numbers (which are all divisible by three) must each represent the final nucleotide 
of a putative domain, because nucleotide 345 corresponds to the final nucleotide of a 115-amino acid reading frame. 
Thus, the designation "3018 nt" indicates the final nucleotide of the final codon of NS2, and the putative NS3 domain 
begins at nucleotide 3019. Adding 319 to 
nucleotide 3019 of Figure 11 yields nucleotide 3338 of Figure 12, which corresponds to amino acid 1007. Similarly, 
the designation "4950 nt" indicates the last nucleotide of the last codon of the putative NS3 domain. Adding 319 to 
nucleotide 4950 in Figure 11 yields nucleotide 5269 in Figure 12, which corresponds to amino acid 1650. Thus, the 
putative amino acid range for NS3 disclosed at Miyamura '928 patent, column 7, lines 8-9, corresponds exactly to 
the nucleotide numbers of Miyamura Figure 11. 
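
For illustration only, the numbering correlation recited in this footnote can be checked with a short calculation. The sketch below is not part of the record: the function and constant names are hypothetical, and the only inputs are the values stated above (translation begins at nucleotide 320 in the Figure 12 numbering and at nucleotide 1 in the Figure 11 numbering, and each codon spans three nucleotides).

```python
# Illustrative sketch only. Names are hypothetical; values are those recited in the footnote.
FIG12_FIRST_CODING_NT = 320  # Figure 12 number of the first nucleotide of the first translated codon
FIG11_FIRST_CODING_NT = 1    # the same nucleotide under the Figure 11 numbering
OFFSET = FIG12_FIRST_CODING_NT - FIG11_FIRST_CODING_NT  # 319


def fig11_to_fig12(nt: int) -> int:
    """Convert a Figure 11 nucleotide number to the Figure 12 numbering."""
    return nt + OFFSET


def fig11_nt_to_amino_acid(nt: int) -> int:
    """Return the 1-based amino acid whose codon contains Figure 11 nucleotide `nt`."""
    return (nt - 1) // 3 + 1


if __name__ == "__main__":
    # A 115-amino acid C domain ends at Figure 11 nucleotide 345.
    assert 3 * 115 == 345

    ns3_first_nt = 3018 + 1  # nucleotide following the final nucleotide of NS2
    ns3_last_nt = 4950       # final nucleotide of the putative NS3 domain

    print(fig11_to_fig12(ns3_first_nt), fig11_nt_to_amino_acid(ns3_first_nt))
    print(fig11_to_fig12(ns3_last_nt), fig11_nt_to_amino_acid(ns3_last_nt))
```

Under these assumptions, the two boundary nucleotides map to Figure 12 nucleotides 3338 and 5269 and to amino acids 1007 and 1650, matching the range recited at column 7, lines 8-9.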

7 At column 17, lines 17-21, the Miyamura '928 patent recites: "Due to the observed similarities between HCV and 
the Flaviviruses, deductions concerning the approximate locations of the corresponding protein domains and 
functions in the HCV polyprotein are possible." 

b. The Legal Standard For Section 102(e) Prior Art 



It has long been established that, absent a Section 102(b) statutory bar, an inventor's own 
work cannot be held against him as prior art under 35 U.S.C. § 102(e). Thus, for a reference 
patent to qualify as prior art under Section 102(e), (1) the application for the reference patent 
must have been by one who is legally "another" and (2) the filing date of the reference patent 
must be "before the invention thereof by the applicant ...." 35 U.S.C. § 102(e). A patent cannot 
be relied upon as prior art under 35 U.S.C. § 102(e) when the record establishes that the relevant 
disclosure relied upon in the rejection is the applicant's own work, and furthermore that the 
relevant portions of the reference patent were obtained from the applicant. See MPEP § 2136.05; 
In re Mathews, 408 F.2d 1393, 161 USPQ 276 (CCPA 1969); In re Land, 368 F.2d 866, 151 
USPQ 621 (CCPA 1966). "When the 102(e) reference patentee got knowledge of the applicant's 
invention from him, as by being associated with him . . . and thereafter describes it, he 
necessarily files the application after applicant's invention date . . . ." Mathews, 408 F.2d at 
1396, 161 USPQ at 279 (quoting Land, 368 F.2d at 879, 151 USPQ at 633 (emphasis original)). 



c. The Subject Matter Of Figures 11 and 12 and at Column 7, lines 8-10 
and Column 17, lines 5-21 of the Miyamura Patent Is Not 
Citable As Prior Art 

Applicants respectfully submit that the record here cannot support a conclusion that the 
Miyamura HCV-1 subject matter relied upon in the Office Action is the invention of "another:" 

• The Miyamura patents do not claim the claimed subject matter of the instant application. 

• A collaboration between Dr. Houghton and Drs. Miyamura and Saito is evidenced by the 
recorded assignment for Serial No. 408,045 (the earliest-filed application from which 
Miyamura claims the benefit of filing date), which is attached as Exhibit I to this Response. 
In that assignment, Drs. Miyamura and Saito assign their rights in the invention to Chiron 
Corporation as a co-assignee with the Director General of the National Institute of Health of 
Japan. Further evidencing a collaboration is the fact that Michael Houghton is named as a 
joint inventor on U.S. patent application Serial No. 637,380, filed January 4, 1991, for the 
Miyamura patent. This establishes a collaboration in connection with the September 15, 
1989 filing date of Serial No. 408,045 (which included the subject matter relied upon in the 
Office Action but did not name Dr. Houghton as a joint inventor). 

8 The subject matter at issue is the Miyamura '928 patent disclosure pertaining to the HCV-1 ORF sequence 
(Figure 12), the putative genomic organization of HCV-1 (Figure 11 and column 7, lines 8-9), and column 17, lines 
17-21. 

• Prior to the earliest filing date of Miyamura (September 15, 1989), similar or identical 
subject matter appeared in applications filed by the inventive entity of the instant 
application. See United States patent application Serial No. 07/355,002, filed May 18, 1989 
("the '002 application"), 9 Figures 62, 62.1 and 62.2 (corresponding to Miyamura Figure 12); 
page 45, lines 8-13 (corresponding to Miyamura patent at column 17, lines 16-21); and page 
123, line 26 (corresponding approximately to Miyamura Figure 11, and Miyamura patent at 
column 7, lines 8-9). 10 The '002 application was incorporated by reference in the earliest-filed 
Miyamura application. 11 The European counterpart of the '002 application is 
incorporated by reference in Miyamura (i.e., EP 388,232). Miyamura patent, column 5, 
lines 37-44. 

• The declaration by the inventors of the '002 patent application, submitted when that 
application was filed, in which they averred that they were the "original, first and joint 
inventors ... of the subject matter which is claimed and for which a patent is sought . . ." 
(copy attached as part of Exhibit F to this Response). 

• Prior to the earliest filing date of Miyamura (September 15, 1989), HCV1 polynucleotide 
sequences in the HCV NS3 region appeared in applications filed by the inventive entity of 
the instant application. See Figures 32 and 47, EP 318,216 to Houghton et al. 12 The '216 
application is incorporated by reference in the Miyamura patent. See Miyamura patent, 
column 5, lines 37-44. 

Applicants believe that these facts prevent a conclusion that the Miyamura subject matter at issue 
is that of "another." 

The Miyamura Declaration, coupled with the declaration by the inventors of the '017 
patent (and the instant application) that was filed with the application for the '017 patent, further 
supports that the Miyamura HCV-1 disclosure relied upon by the Examiner was obtained by the 

9 This application was incorporated by reference into Serial No. 07/456,637 (Exhibit D hereto, page 1, lines 10-12; 
page 2, lines 2-3), which itself was incorporated by reference into the '017 patent (see, e.g., '017 patent, column 2, 
lines 43-49). 

10 For the convenience of the Examiner a copy of the following from the '002 application is attached as Exhibit F to 
this Response: Figure 62, 62-1 and 62-2, and pages 17 (describing Figure 62), 45 and 123. Also attached as Exhibit 
F to this Response is a copy of the Filing Receipt for the '002 application, showing the inventorship. 

11 See page 8, lines 3-4 of the Miyamura '045 application, attached as Exhibit G to this Response. 

Miyamura inventors from the inventors of the instant application (through Dr. Houghton), and 
that this disclosure was the inventors' own work. See Mathews; 
Exhibits A & B hereto. Accordingly, the portions of the Miyamura patent pertaining to the 
HCV-1 ORF sequence (Figure 12), putative genomic organization of HCV-1 (Figure 11 and 
column 7, lines 8-9 of the Miyamura patent), and column 17, lines 17-21 of the Miyamura 
patent, have effectively been removed as prior art under 35 U.S.C. § 102(e). 

3. The Disclosure in Miyamura of Methods of Production of 
Polypeptides Encoded by the HCV Genome, Taken Alone, Does Not 
Render Claims 27-35 Obvious 

The Office Action also relies upon "Examples I-IV at columns 28-39" to contend that 
Miyamura et al. "teach preparation of cloning vectors, and transformed host cells comprising the 
vectors, comprising inserts of specific, defined, regions found anywhere in a nucleic acid 
sequence encoding all or part of an hepatitis C virus polyprotein." Office Action at 11. The 
patent "explicitly teach, at cols. 8-10, that expression vectors comprising transcriptional and 
translational regulatory elements operably linked to a polynucleotide encoding a desired regions 
[sic] of the HCV polyprotein should be used to produce desired portions of the hepatitis C virus 
polyprotein in host cells." Id. Miyamura et al. "further suggest preparation of expression 
constructs providing fusions of hepatitis C virus amino acid sequence regions with proteins 
commonly used in the art as fusion partners such as β-galactosidase and superoxide dismutase 
[SOD]." Office Action at 11 (bracketed text in original). 

As shown above, there is no HCV protease sequence in Miyamura that is citable as prior 
art against the instant application. Absent a prior art HCV protease sequence to express, 
disclosure concerning standard methodology for expressing polypeptides cannot by itself support 
a prima facie obviousness rejection. 

4. The Rejection Must Be Withdrawn Based on the Removal of the 
Subject Matter of Figures 11 and 12, Column 17, lines 17-21, and 
Column 7, lines 8-9, of the Miyamura Patent. 

The Miyamura patent cannot support a prima facie obviousness rejection because, as 
shown above, the key disclosures thereof are not available as prior art. Applicants submit that 

12 For the convenience of the Examiner a copy of the following from the '216 application is attached as Exhibit J to 
this Response: Page 1, page 9 (describing Figures 32 and 47), and Figures 32-1 to 32-7 and 47-1 to 47-8. 

the outstanding rejection must be withdrawn in the absence of the subject matter derived from 
the inventors of the instant application. The secondary references on which the Examiner relies 
-- Gorbalenya, Bazan and Miller -- do not remedy Miyamura's deficiencies. 

Gorbalenya and Bazan concern flaviviruses. Neither reference mentions HCV, and HCV 
is not a flavivirus. 13 Miller discloses only nucleic acid sequence of a helicase region within the 
NS3 domain of HCV. Miller does not disclose any sequence encoding a protease. Thus, there is 
no link between Gorbalenya, Bazan or Miller and the pending claims of the instant application. In 
view of the foregoing, none of the secondary references cited by the Examiner can support a 
prima facie obviousness rejection. 14 



13 Miyamura itself states that HCV is "a new viral class" distinct from flaviviruses. See, e.g., Miyamura '928 patent, 
column 2, lines 1-8; see also C. Rice, "Flaviviridae -- The Viruses And Their Replication", in Fields Virology, Vol. 
1 (B. Fields) (3d ed. 1995), pages 932-33, which is attached as Exhibit C to this Response. 

14 For the record, Applicants do not concede that the claims are prima facie obvious over Miyamura in view of 
Gorbalenya, Bazan and Miller even if the HCV-1 subject matter of Miyamura is legally available prior art (which it 
is not). 




Application No.: 09/884,455 



Docket No.: 223002010004 



CONCLUSION 



In light of the arguments set forth above, Applicants earnestly believe that they are 
entitled to a letters patent, and respectfully solicit the Examiner to expedite prosecution of this 
patent application to issuance. If it is determined that a telephone conference would expedite the 
prosecution of this application, the Examiner is invited to telephone the undersigned at the 
number given below. 

In the event the U.S. Patent and Trademark Office determines that an extension and/or 
other relief is required, applicant petitions for any required relief including extensions of time 
and authorizes the Commissioner to charge the cost of such petitions and/or other fees due in 
connection with the filing of this document to Deposit Account No. 03-1952 referencing docket 
no. 223002010004. However, the Commissioner is not authorized to charge the cost of the issue 
fee to the Deposit Account. 

Dated: December 30, 2004 Respectfully submitted, 




Shantanu Basu 



Registration No.: 43,318 
MORRISON & FOERSTER LLP 
755 Page Mill Road 
Palo Alto, California 94304 
(650) 813-5995 

CLASSIFICATION 
Flaviviruses 

The Flavivirus genus includes more than 68 members 
separated into groups on the basis of serological related- 
ness (37) (Table 1; see Chapter 31). More recently, simi- 
lar relationships have been found by comparison of fla- 
vivirus genome sequences (21,171). Most flaviviruses are 
arthropod-borne, being transmitted to vertebrates by chron- 
ically infected mosquito or tick vectors. However, isolates 
from bats and rodents, without known insect vectors, also 
have been identified. Arthropod-borne flaviviruses cause 
significant human and animal disease and are distributed 
worldwide (206,289) (see also Chapter 31). Clinical symp- 
toms vary and include fever, encephalitis and hemorrhag- 
ic fever (see Chapter 31). Entities of major global concern 
include dengue fever with its associated dengue hemorrhagic 
fever (DHF) and shock syndrome (DSS) (113,114), 
Japanese encephalitis (JE) (207), and YF. Tick-borne en- 
cephalitis (TBE), Kyasanur Forest disease, West Nile en- 
cephalitis (WN), St. Louis encephalitis (SLE), and Mur- 
ray Valley encephalitis (MVE) are other important agents 
of regional endemic or epidemic disease (206) (see Chapter 31). 
Thus far, vaccination is available for YF, using the 
live-attenuated 17D strain (318), and for TBE and JE using 
inactivated virus (124). 

Pestiviruses 

Currently recognized pestiviruses include three sero- 
logically related animal pathogens (203) (see Chapter 33). 
These include the type virus, bovine viral diarrhea virus 
(BVDV), classical swine fever virus (CSFV; also called 
hog cholera virus), and border disease virus (BDV) of 
sheep. The border disease group has recently been shown 
to comprise BVDV-like isolates as well as true BDV strains 
(16). Pestivirus diseases are widespread and of major eco- 
nomic importance to the livestock industry (203). Trans- 
mission occurs by direct or indirect contact as well as by 
congenital routes. Clinical manifestations vary and include 
inapparent infections, acute or persistent subclinical in- 
fections, fetal death and congenital abnormalities, wasting 
disease, and an acute fatal illness called mucosal disease 
(MD) (203). Recently, a new variant of BVDV has been 
identified that causes severe thrombocytopenia and hem- 
orrhagic syndrome in adult animals (22,63,64,255). Live- 



TABLE 1. Members of the Flaviviridae 

Genus                  Group                                Type member 

Flaviviruses           Tick-borne encephalitis (12ᵃ, Tᵇ)    Central European encephalitis (TBE-W) 
                                                            Far Eastern encephalitis (TBE-FE) 
                       Rio Bravoᶜ (6, T)                    Rio Bravo 
                       Japanese encephalitis (10, M)        Japanese encephalitis (JE) 
                                                            Kunjin (KUN) 
                                                            Murray Valley encephalitis (MVE) 
                                                            St. Louis encephalitis (SLE) 
                                                            West Nile (WN) 
                       Tyuleniy (3, T)                      Tyuleniy 
                       Ntayaᶜ (5, M)                        Ntaya 
                       Uganda S (4, M)                      Uganda S 
                       Dengue (4, M)                        Dengue type 1 (DEN1) 
                                                            Dengue type 2 (DEN2) 
                                                            Dengue type 3 (DEN3) 
                                                            Dengue type 4 (DEN4) 
                       Modoc (5, U)                         Modoc 
                       Ungroupedᶜ (17, M)                   Yellow fever (YF) 
Pestiviruses           Bovine viral diarrhea                Bovine viral diarrhea (BVDV) 
                       Classical swine fever                Hog cholera or classical swine fever (CSFVᵉ) 
                       Border disease                       Border disease (BDV) 
Hepatitis C virusesᵈ   Hepatitis C                          Hepatitis C (HCV) 

ᵃ Number of recognized members in each antigenic group [from Calisher et al. (37)]. 
ᵇ Arthropod vectors: T, tick; M, mosquito; U, unidentified or no vector. 
ᶜ Arthropod vectors for some members of these groups have not been identified. The ungrouped flaviviruses include 
mosquito- and tick-transmitted viruses as well as some with no known vector. 
ᵈ The hepatitis C viruses include a large number of isolates that can be divided into several groups or genotypes on the basis 
of genetic divergence (36,269,291). An official name for this genus and a standardized nomenclature for different genotypes have 
not yet been agreed upon. 
ᵉ In the pestivirus literature, HCV has been a common abbreviation for hog cholera virus. More recent publications and this 
chapter use CSFV to avoid confusion with the human hepatitis C viruses. 

attenuated strains and inactivated virus preparations are 
available for vaccination against CSFV and BVDV (203), 
but there is need for improved pestivirus vaccines (see 
Chapter 33). 



Hepatitis C Viruses 

The hepatitis C viruses (HCV) compose the remaining 
genus of the Flaviviridae. After the development of diagnostic 
tests for hepatitis A virus (Chapter 24) and hepatitis 
B virus (Chapter 86), an additional agent, which could 
be experimentally transmitted to chimpanzees (4,139,309), 
became recognized as the major cause of transfusion-ac- 
quired hepatitis. The causative agent, previously designat- 
ed non-A, non-B hepatitis virus and now referred to as 
HCV, was identified in 1989 (54). Development of diag- 
nostic tests to identify HCV carriers among blood donors 
(52,162) has already markedly reduced the frequency of 
posttransfusion hepatitis (3). Humans are the only known 
natural host for HCV; there is no evidence for vector-me- 
diated transmission. HCV infection is found throughout 



the world, and the prevalence of anti-HCV antibodies ranges 
from 0.4% to 2% in most developed countries to more than 
14% in Egypt (129) (see Chapter 32). Besides transmis- 
sion via blood or blood products, or less frequently by sex- 
ual and congenital routes, sporadic cases occur that are not 
associated with known risk factors and account for more 
than 40% of HCV cases (5,194). Infections are usually 
chronic (6), and clinical outcomes (138) (see Chapter 32) 
range from an inapparent carrier state to acute hepatitis, 
chronic active hepatitis, and cirrhosis, which is strongly 
associated with the development of hepatocellular carci- 
noma (HCC) (288). Although alpha interferon has been 
shown to be useful for the treatment of some patients with 
chronic HCV infections (65,71) and subunit vaccines show 
some promise in the chimpanzee model (53), future efforts 
are needed to develop more effective therapies and vac- 
cines. The considerable diversity observed among differ- 
ent HCV isolates (36,290), the emergence of genetic vari- 
ants in chronically infected individuals (76,131,151,152, 
163,170,227,336,337), and the lack of protective immunity 
elicited after HCV infection (81,245) present major challenges 
toward these goals. 

FIG. 1. The flavivirus lifecycle. 

Cross-Reference to Related Applications 

10 This application is a continuation-in-part of 

attorney docket 2300-0063.28 (U.S. S.N. 07/355,002) filed 
18 May 1989; which is a continuation-in-part of attorney 
docket number 2300-0063.29 (U.S. S.N. 07/341,334) filed 20 
April 1989; which is a continuation-in-part of attorney 

15 docket number 2300-0063.59 (PCT/US88/04125) filed 18 
November 1988, converted to U.S. National phase on 21 
April 1989 and assigned attorney docket number 2300- 
0063.26 (U.S. S.N. 353,896), and a continuation-in-part of 
attorney docket number 2300-0063.25 (U.S. S.N. 07/325,338) 

20 filed 17 March 1989 (now abandoned); which are 

continuations-in-part of attorney docket number 2300- 
0063.24 (U.S. S.N. 271,450) filed 14 November 1988, now 
abandoned; which is a continuation-in-part of attorney 
docket number 2300-0063.23 (U.S. S.N. 263,584) filed 26 

25 October 1988, now abandoned; which is a continuation-in- 
part of attorney docket number 2300-0063.22 (formerly 
2300-0237, U.S. S.N. 191,263) filed 6 May 1988, now 
abandoned; which is a continuation-in-part of attorney 
docket number 2300-0063.21 (formerly 2300-0228, U.S. S.N. 

30 161,072) filed 26 February 1988, now abandoned; which is a 
continuation-in-part of attorney docket number 2300- 
0063.20 (formerly 2300-0219, U.S. S.N. 139,886) filed 30 
December 1987, now abandoned; which is a continuation-in-part 
of attorney docket number 2300-0063 (formerly 2300- 

35 
0203, U.S. S.N. 122,714) filed 18 November 1987, now 
abandoned; the aforementioned applications are, in their 
entirety, incorporated herein by reference. 

5 Technical Field 

The invention relates to materials and 
methodologies for managing the spread of non-A, non-B 
hepatitis virus (NANBV) infection. More specifically, it 
relates to diagnostic DNA fragments, diagnostic proteins, 
10 diagnostic antibodies and protective antigens and antibodies 
for an etiologic agent of NANB hepatitis, i.e., 
hepatitis C virus. 

References Cited in the Application 
15 Barr et al. (1986), Biotechniques 4:428. 
Bradley et al. (1985), Gastroenterology 88:773. 
Botstein (1979), Gene 8:17. 
Brinton, M. A. (1986) in THE VIRUSES: THE TOGAVIRIDAE AND 
FLAVIVIRIDAE (Series eds. Fraenkel-Conrat and Wagner, vol. 
20 eds. Schlesinger and Schlesinger, Plenum Press), p. 327-374. 
Broach (1981) in: Molecular Biology of the Yeast 
Saccharomyces, Vol. 1, p. 445, Cold Spring Harbor Press. 
Broach et al. (1983), Meth. Enz. 101:307. 
25 Catty (1988), ANTIBODIES, Volume 1: A Practical Approach 
(IRL Press). 
Chaney et al. (1986), Cell and Molecular Genetics 12:237. 
Chakrabarti et al. (1985), Mol. Cell Biol. 5:3403. 
Chang et al. (1977), Nature 198:1056. 
30 Chen and Seeburg (1985), DNA 4:165. 
Chirgwin et al. (1979), Biochemistry 18:5294. 
Chomczynski and Sacchi (1987), Analytical Biochemistry 
162:156. 
Choo et al. (1989), Science 244:359. 
35 Clewell et al. (1969), Proc. Natl. Acad. Sci. USA 62:1159. 
Clewell (1972), J. Bacteriol. 110:667. 

encoded in the major ORF of the HCV genome. Also 
indicated in the figure are the possible functions of the 
flaviviral polypeptides cleaved from the flaviviral 
polyprotein. In addition, the relative placements of the 
5 HCV polypeptides, NANB5-1-1 and C100, with respect to the 
putative HCV polyprotein are indicated. 

Fig. 70 shows relevant characteristics of AcNPV 
transfer vectors used for high level expression of 
nonfused foreign proteins. It also shows a restriction 
10 endonuclease map of the transfer vector pAc373. 

Fig. 71 shows the nucleotide sequence of clone 
6k, the part of the sequence which overlaps clone 16jh, 
and the amino acids encoded therein. 

Fig. 72 shows a composite cDNA sequence derived 
15 from overlapping clones b114a, 18g, ag30a, CA205a, 
CA290a, CA216a, pi14a, CA167b, CA156e, CA84a, CA59a, K9-1 
(also called k9-1), 26j, 13i, 12f, 14i, 11b, 7f, 7e, 8h, 
33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c, 14c, 8f, 33f, 
33g, 39c, 35f, 19g, 26g, 15e, b5a, 16jh and 6k; also shown 
20 are the amino acids encoded in the positive strand of the 
cDNA (which is the equivalent of the HCV RNA) . 

Fig. 73 shows the linkers used in the construction 
of pS3-56C100m. 

Fig. 74 shows the nucleotide sequence of the HCV 
25 cDNA in clone 31, the amino acids encoded therein, and 
putative restriction enzyme sites encoded therein. 

Fig. 75 shows the nucleotide sequence of the HCV 
cDNA in clone pl31jh, and its overlap with the nucleotide 
sequence in clone 6k. 
30 Fig. 76 shows a flow chart for construction of 

the expression vector pC100-d#3. 

Fig. 77 shows a flow chart for construction of 
the expression vector pS2d#9. 

Fig. 78 shows a flow chart for construction of 
35 the expression vector pNS11d/13. 

FIGURE 72-3 



1981 



2041 



2101 



2161 



2221 



2341 



2401 



CTTCGACGGACGTTGACCTGCGCCCCGCTTGCAACGCTAGACCTTCTGTCCCTGTCCAGG 

GAGCTCAGCCCGTTACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCTTCACA 
CTCGAGTCGGGCAATGACGACTGGTGATGTGTCACCGTCCAGGAGGGCACAAGGAAGTGT 

^CT^GCCTTGTCC^CCGGCCT^ 

TGGG ATGG TCGG AAC AGG TGG C CGG AG T AGG TGG AGG TGG TCTTG T AACACC TGCACG TC 

^^^^ lyValG1 y SerSerI1 ^ aS ^ T ^P^aIleLysTrpGluTyrValVal 
TACTTGTACGGGGTGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT 
ATCAACATGCCCCACCCCAGTTCGTAGCGCAGGACCCGGTAATTCACCCTCATGCAGCAA 

I^^uPhel^uI^ul^uAlaAspAlaArgVal^ 

CTCCTGTTCCTTCTGCTTCCAGACGCGCGCGTCTGCTCCTGCTTCTGGATGATGCTACTC 
GAGGACAAGGAAGACGAACGTCTGCGCGCGCAGACGAGGACGAACACCTACTACGATGAG. 

ATATCCCAAGCGGAGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAGCATCCCTGGCC 
TATAGGGTTCGCCTCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGACCGG 

CCCTCOTlXKCACA»MTS<KA^JGC»CMGA»GJCGaAAC0TACCSIS4ACITCCCA 

AACCGCAACGGGGTCGCCCGCATGCGCGACCTGTGCCTCCACCGGCGCAGCACACCGCCA 

CAACAAGAGCAGCCCAACTACCGCGACTGAGACAGTGGTATAATGTTCGCGATATAGTCG 

ACCACGAACACCACCGAAGTCATAAAAGACTGGTCTCACCTTCGCGTTGACGTGCACACC 
IleProProLeuAsnValArgGlyGlyArgAspAlaVallleLeuLeuMetCvsAlaVal 

TAAGGGGGGGAGTTGCAGGCTCCCCCCGCGCTGCGGCAGTAGAATGAGTACACACGACAT 

HisProThrLeuValPheAsplleThrLysLeuLeuLeuAlaValPheGlvProLeuTm 
^^^^^ TATTTCA ^^ C< ^ TTCCTCC TCGCCGTCTTCGGACCCCTTTGG 
GTGGGCTGAGACCATAAACTGTAGTGGTTTAACGACGACCGGCAGAAGCCTGGGGAAACC 

AAGACGCGCAATCGCGCCTTCTACTAGCCTCCGGTAATGCACGTTTACCAGTAGTAATTC 

JteuGlyAlaLeuThrGlyTh^ 
TTAGGGGCGCTTACTGGCACCTATGTTTATAACCATC^ 

AATCCCCGCG AATG ACCG TGGATACAAATATTGG TAG AG TG AGG AG AAGCCCTG ACCCGC 
S^^SSiX^^SAspLeuAl^^ 

CACAACGGCTTGCGAGATCTGGCCGTGGCTGTAGAGCCAGTCGTCTTCTCCCAAAl^ 
GTGTTGCCGAACGCTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTTACCTC 

2521 



2581 



2641 



2701 



2821 



2881 



FIGURE 72-4 



ThrLysLeuIleThrTrpGlyAlaAspThrAlaAlaCysGlyAspIlelleAsnGlyLeu 
2941 ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG 
TGGTTCGAGTAGTGCACCCCCCGTCTATGGCGGCGCACGCCACTGTAGTAGTTGCCGAAC 

ProValSerAlaArgArgGlyArgGluIleLeuLeuGlyProAlaAspGlyMetjv^^er 
3001 CCTGTT-TCCGCCCGCAGGGGCCGGG AGATACTGCTCGGGCCAGCCG ATGG AAT^GTOTCC 
GGACAAAGGCGGGCGTCCCCGGCCCTCTATGACGAGCCCGGTCGGCTACCTTACCAGAGG 

l^blyTrpArgLeuLeuAlaProIleThrAlaTyrAlaGlnGlnThrArgGlyLeuLeu 
3061 AAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTACGCCCAGCAGACAAGGGGCCTCCTA 
TTCgCCACCTCCAACGACCGCGGGTAGTGCCGCATGCGGGTCGTCTGTTCCCCGGAGGAT 

GlyCysIlelleThrSerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGln 
3121 GGGTGCATAAl^CCAGCCTAACTGGCCGGGACAAAAACCAAGTGGAGGGTGAGGTCCAG 
CCCACGTATTAGTGGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTC 

I ;^ 

IleValSerThrMaAlaGlnThrPheLeuAlaThrjCysIleAsnGlyValCysTrpThr 
3181 ATTGTGTCAACTGCTGCCCAAACCTTCCTG ~ 
TAACACAGTTGACGACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGA 

ValTyrHisGlyAlaGlyThrArgThrlleAlaSerProLysGlyProVallleGlnMet 
3241 GTCTACCACGGGGCCGGAACGAGGACCATCGCGTCACCCAAGGGTCCTGTCATCCAGATG 
CAGATGGTGCCCCGGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTAC 

TyrThrAsnValAspGlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerLeu 
3301 TATACCAATGTAGACCAAGACCTTGTGGGCTGGCCCGCTCCGCAAGGTAGCCGCTCATTG 
ATATGGTTACATCTGGTTCTGGAACACCCGACCGGGCGAGGCGTTCCATCGGCGAGTAAC 

ThrProCysThrCysGlySerSerAspLeuTyrLeuValThrArgHisAlaAspVallle 
3361 ACACCCTGCACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATGTCATT 
TGTGGGACGTGAACGCCGAGGAGCCTGGAAATGGACCAGTGCTCCGTGCGGCTACAGTAA 

ProValArgArgArgGlyAspSerArgGlySerLeuLeuSerProArgProlleSerTyr 
3421 CCCGTGCGCCGGCGGGGTGATAGCAGGGGCAGCCTGCTGTCGCCCCGGCCCATTTCCTAC 
GGGCACGCGGCCGCCCCACTATCGTCCCCGTCGGACGACAGCGGGGCCGGGTAAAGGATG 

LeuLysGlySexSerGlyGlyProIieuLeuCysProAlaGlyHisAlaValGlyllePhe 
3481 TTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTGCCCCGCGGGGCACGCCGTGGGCATATTT 
AACTTTCCGAGGAGCCCCCCAGGCGACAACACGGGGCGCCCCGTGCGGCACCCGTATAAA 

ArgMaMaValCysThrArgGlyValAlaLysAlaValAspPhelleProValGluAsn 
3541 AGGGCCGCGGTGTGCACCCX3TGGAGTGGCTAAGGCGGTGGACTTTATCCCTGTGGAGAAC 
TCCCGGCGCCACACGTGGGCACCTCACCGATTCCGCC^CCTGAAATAGGGACACCTCTTC 

LeuGluThrThrMetArgSerProValPheThrAspAsnSerSerProProValValPro 
3601 CTAGAGACAACC^TGAGGTCCCCGGTGTTCACGGATAACTCCTCTCCACCAGTAGTGCCC 
GATCTCTGTTGGTACTCCAGGGGCCACAAGTGCCTATTGAGGAGAGGTGGTCATCACGGG 

GlnSerPheGlnValAlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysVal 
3661 CAG AGCTTCCAGGTGGCTCACCTCCATGCTCCCACAGGCAGCGGCAAAAGCACCAAGGTC 
GTCTCGAAGGTCCACCGAGTGGAGGTACGAGGGTGTCCGTCGCCGTTTTCGTGGTTCCAG 

ProAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsnProSerValAlaAla 
3721 CCGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAACCCCTCTGTTGCTGCA 
GGCCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTGGGGAGACAACGACGT 

ThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlylleAspProAsnlleArgThr 
3781 ACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGATC 

TGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTAGGATTGTAGTCCTGG 

GlyValArgThrlleThrThrtlySerProIleThrTyrSerThrTyrGlyLysPheLeu 
3841 GGGGTGAGAACAATTACCACTGGCAGCCCCATCAOSTACTCCACCTACGGCAAGTTCCTT 
CCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCAT6AGGTGGATC 

AlaAspGlyGlyCysSerGlyGlyAlaTyrAspIlellelleCysAspGluCysHisSer 
3901 GCCGACGGCGGG1X3CTCGGGGGGCGCTTATGACATAATAATTTGTGACGAGTGCCACTCC 
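The row beginning at position 2941 above is legible in all three of its lines (encoded amino acids, sense strand, complementary strand), so it can serve as a check on how the listing is laid out. The following is a minimal Python sketch, not part of the original filing, that translates that sense strand with the standard genetic code and derives the complementary strand. The function names `translate` and `complement` and the label "End" for stop codons are illustrative assumptions.

```python
# Standard genetic code, listed with bases in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, matching the style of the listing.
# "End" for stop codons is an assumed label, not taken from the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str) -> str:
    """Translate complete codons starting at position 0 into three-letter names."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)
    )


# Sense strand of the row labelled 2941 in the listing.
SENSE_2941 = "ACCAAGCTCATCACGTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATCAACGGCTTG"

print(translate(SENSE_2941))
print(complement(SENSE_2941))
```

Run on that row, the sketch reproduces both the amino acid line and the complementary-strand line printed in the figure, which suggests that rows damaged in reproduction could be cross-checked the same way wherever at least one strand is legible.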
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pn.pil.oge <7^n>lMage Vpyu-p>lij\ n (ca. 1599) : the state or period 
of being a pupil 

pup·pet \'pə-pət\ n, often attrib [ME popet, fr. MF poupette, dim. of (as-
sumed) poupe doll, fr. L pupa] (1538) 1 a : a small-scale figure (as of a
person or animal) usu. with a cloth body and hollow head that fits over
and is moved by the hand b : marionette 2 : doll 1 3 : one
whose acts are controlled by an outside force or influence — pup·pet·like
\-,līk\ adj

pup·pe·teer \,pə-pə-'tir\ n (ca. 1923) : one who manipulates puppets
pup·pet·ry \'pə-pə-trē\ n, pl -ries (1528) 1 : the production or creation
of puppets or puppet shows 2 : the art of manipulating puppets



pup·py \'pə-pē\ n, pl puppies [ME popi, fr. MF poupée doll, toy, fr. (as-
sumed) poupe doll] (1591) : a young domestic dog; specif : one less than
a year old — pup·py·hood \-,hud\ n — pup·py·ish \-ish\ adj — pup·py·like
\-,līk\ adj



puppy dog n (1595) : a domestic dog; esp : one having the lovable
attributes of a puppy

puppy love n (1834) : transitory affection felt by a boy or girl for one of
the opposite sex

pup tent n (1863) : a low small tent for two persons usu. consisting of 
two halves fastened together 

Pu·ra·na \pu-'rä-nə\ n, often cap [Skt purāṇa, fr. purāṇa ancient, fr.
purā formerly; akin to Skt pura before, Gk para beside, pro before —
more at for] (1696) : one of a class of Hindu sacred writings chiefly
from A.D. 300 to A.D. 750 comprising popular myths and legends and
other traditional lore — Pu·ra·nic \-nik\ adj

pur·blind \'pər-,blīnd\ adj [ME pur blind, fr. pur purely, wholly, fr. pur
pure] (14c) 1 a obs : wholly blind b : partly blind 2 : lacking in
vision, insight, or understanding : obtuse — pur·blind·ly \-,blīn(d)-lē\
adv — pur·blind·ness \-,blīn(d)-nəs\ n
¹pur·chase \'pər-chəs\ vb pur·chased; pur·chas·ing [ME purchacen, fr.
OF purchacier to seek to obtain, fr. por-, pur- for, forward (modif. of L
pro-) + chacier to pursue, chase — more at pro-] vt (14c) 1 a archaic
: gain, acquire b : to acquire (real estate) by means other than
descent or inheritance c : to obtain by paying money or its equivalent
: buy d : to obtain by labor, danger, or sacrifice 2 : to apply a device
for obtaining a mechanical advantage to (as something to be moved);
also : to move by a purchase 3 : to constitute the means for buying
<our dollars ~ less each year> ~ vi : to purchase something —
pur·chas·able \-chə-sə-bəl\ adj — pur·chas·er n
²purchase n (14c) 1 : an act or instance of purchasing 2 : something
obtained esp. for a price in money or its equivalent 3 a (1) : a mechan-
ical hold or advantage applied to the raising or moving of heavy bodies
(2) : an apparatus or device by which advantage is gained b (1) : an
advantage (as a firm hold or position) used in applying one's power
<clutching the steering wheel for more ~ —Barry Crump> (2) : a
means of exerting power

P ^t^}lF?^ n ff in<i L parda '. Ut - ^reen, veil] (186$) 1 : seclusion 
of women from public observation among Muslims and some Hindus
esp. in India 2 : a state of seclusion or concealment 

pure \'pyur\ adj pur·er; pur·est [ME pur, fr. OF, fr. L purus; akin to
OHG fowen to sift, Skt punāti he cleanses, MIr úr fresh, new] (14c) 1 a
(1) : unmixed with any other matter <~ gold> (2) : free from dust,
dirt, or taint <~ food> (3) : spotless, stainless b : free from
harshness or roughness and being in tune — used of a musical tone c
of a vowel : characterized by no appreciable alteration of articulation
during utterance 2 a : being thus and no other : sheer, unmitigated
<~ folly> b (1) : abstract, theoretical (2) : a priori <~ me-
chanics> c : not directed toward exposition of reality or solution of
practical problems <~ literature> d : being nonobjective and to be
appraised on formal and technical qualities only <~ form> 3 a (1)
: free from what vitiates, weakens, or pollutes (2) : containing nothing
that does not properly belong b : free from moral fault or guilt
c : marked by chastity : continent d (1) : of pure blood and unmixed
ancestry (2) : homozygous in and breeding true for one or more char-
acters e : ritually clean syn see chaste — pure·ness n
pure-blood·ed \'pyur-,blə-dəd\ or pure-blood \-,bləd\ adj (1821)
: full-blooded 1 — pure-blood \-,bləd\ n

pure·bred \-'bred\ adj (1868) : bred from members of a recog-
nized breed, strain, or kind without admixture of other blood over
many generations — purebred \-,bred\ n

pure democracy n (ca. 1910) : democracy in which the power is exer-
cised directly by the people rather than through representatives

¹pu·ree or pu·rée \pyu-'rā, -'rē\ n [F purée, fr. MF, fr. fem. of puré, pp.
of purer to purify, strain, fr. L purare to purify, fr. purus] (1707) 1 : a
paste or thick liquid suspension usu. made from cooked food ground
finely 2 : a thick soup made of pureed vegetables

²puree vt pu·reed; pu·ree·ing (1928) : to make a puree of

pure imaginary n (1947) : a complex number that is the product of a
real number other than zero and the imaginary unit — pure imaginary adj

pure·ly \'pyur-lē\ adv (14c) 1 : wholly, completely <a selection
based ~ on merit> 2 : without admixture of anything injurious or
foreign 3 : simply, merely <read ~ for relaxation> 4 : in a chaste
or innocent manner

pur·fle \'pər-fəl\ vt pur·fled; pur·fling \-f(ə-)liŋ\ [ME purfilen, fr. MF
porfiler, fr. (assumed) VL profilare, fr. L pro- forward + LL filare to spin
— more at profile] (14c) : to ornament the border or edges of —
purfle n

pur·ga·tion \,pər-'gā-shən\ n (14c) : the act or result of purging

L purgatus, pp.] (15c) : purging or tending to purge
²purgative n (1626) : a purging medicine : cathartic
pur·ga·to·ri·al \,pər-gə-'tōr-ē-əl, -'tor-\ adj (15c) 1 : of, relating to, or
suggestive of purgatory 2 : cleansing of sin : expiatory

pur·ga·to·ry \'pər-gə-,tōr-ē, -,tor-\ n, pl -ries [ME, fr. AF or ML; AF
purgatorie, fr. ML purgatorium, fr. LL, neut. of purgatorius purging, fr.
L purgare] (13c) 1 : an intermediate state after death for expiatory pu-
rification; specif : a place or state of punishment wherein according to
Roman Catholic doctrine the souls of those who die in God's grace
may make satisfaction for past sins and so become fit for heaven 2 : a
place or state of temporary suffering or misery

¹purge \'pərj\ vb purged; purg·ing [ME, fr. MF purgier, fr. L purigare,
purgare to purify, purge, fr. purus pure + -igare (akin to agere to drive,
do) — more at act] vt (14c) 1 a : to clear of guilt b : to free from
moral or ceremonial defilement 2 a : to cause evacuation from (as the
bowels) b (1) : to make free of something unwanted <~ a manhole of
gas> <~ yourself of fear> (2) : to free (as a boiler) of sediment or re-
lieve (as a steam pipe) of trapped air by bleeding c (1) : to rid (as a na-
tion or party) by a purge (2) : to get rid of <the leaders had been
purged> <~ money-losing operations> ~ vi 1 : to become purged 2
: to have or produce frequent evacuations 3 : to cause purgation —
purg·er n

²purge n (1563) 1 : something that purges; esp : purgative 2 a : an
act or instance of purging b : the removal of elements or members re-
garded as undesirable and esp. as treacherous or disloyal

pu·ri \'pur-ē\ n, pl puri or puris [Hindi pūrī, fr. Skt pūra] (1839) : a light
puffy fried wheat cake of India



pu·ri·fi·ca·tion \,pyur-ə-fə-'kā-shən\ n (14c) : the act or an instance of
purifying or of being purified
pu·ri·fi·ca·tor \'pyur-ə-fə-,kā-tər\ n (1853) 1 : a linen cloth used to
wipe the chalice after celebration of the Eucharist 2 : one that purifies

pu·ri·fi·ca·to·ry \pyu-'ri-fi-kə-,tōr-ē\ adj (1610)
: serving, tending, or intended to purify

pu·ri·fy \'pyur-ə-,fī\ vb -fied; -fy·ing [ME purifien, fr. MF purifier, fr. L
purificare, fr. L purus + -ificare -ify] vt (14c) : to make pure: as a : to
clear from material defilement or imperfection b : to free from guilt
or moral or ceremonial blemish c : to free from undesirable elements
~ vi : to grow or become pure or clean — pu·ri·fi·er \-,fī(-ə)r\ n
Pu·rim \'pur-im, 'pyur-, -,ēm; pu-'rim, pyu-, -'rēm\ n [Heb pūrīm, lit.,
lots; fr. the casting of lots by Haman (Esth 9:24-26)] (14c) : a Jewish
holiday celebrated on the 14th of Adar in commemoration of the deliv-
erance of the Jews from the massacre plotted by Haman
pu·rine \'pyur-,ēn\ n [G Purin, fr. L purus pure + NL uricus uric (fr. E
uric) + G -in -ine] (1898) 1 : a crystalline base C5H4N4 that is the par-
ent of compounds of the uric-acid group 2 : a derivative of purine; esp
: a base (as adenine or guanine) that is a constituent of DNA or RNA
pur·ism \'pyur-,i-zəm\ n (1803) 1 : an example of rigid adherence to or
insistence on purity or nicety esp. in use of words; esp : a word, phrase,
or sense used chiefly by purists 2 : the quality or practice of adher-
ence to purity esp. in language

pur·ist \'pyur-ist\ n (ca. 1706) : one who adheres strictly and often ex-
cessively to a tradition; esp : one preoccupied with the purity of a lan-
guage and its protection from the use of foreign or altered forms —
pu·ris·tic \pyu-'ris-tik\ adj — pu·ris·ti·cal·ly \-ti-k(ə-)lē\ adv
¹pu·ri·tan \'pyur-ə-tᵊn\ n [prob. fr. LL puritas purity] (1572) 1 cap : a
member of a 16th and 17th century Protestant group in England and
New England opposing as unscriptural the ceremonial worship and the
prelacy of the Church of England 2 : one who practices or preaches a
more rigorous or professedly purer moral code than that which prevails

²puritan adj, often cap (1589) : of or relating to puritans, the Puritans,
or puritanism
pu·ri·tan·i·cal \,pyur-ə-'ta-ni-kəl\ adj (1607) 1 : puritan 2 : of, relat-
ing to, or characterized by a rigid morality — pu·ri·tan·i·cal·ly
\-k(ə-)lē\ adv

pu·ri·tan·ism \'pyur-ə-tᵊn-,i-zəm\ n (1573) 1 cap : the beliefs and prac-
tices characteristic of the Puritans 2 : strictness and austerity esp. in
matters of religion or conduct

pu.ri.ry Vpyur-3-te\ n [ME purete, fr. OF purete, fr. LL puritat- purito 

£i™ST£ pure] ( } : the quaIity or statc of being pure z3 

Pur·kin·je cell \(,)pər-'kin-jē-\ n [Jan Purkinje] (ca. 1890) : any of nu-
merous nerve cells that occupy the middle layer of the cerebellar cor-
tex and are characterized by a large globose body with massive den-
drites directed outward and a single slender axon directed inward

Purkinje fiber n (ca. 1890) : any of the modified cardiac muscle fibers
that have few nuclei, granulated central cytoplasm, and sparse periph-
eral striations and that make up a network of conducting tissue in the
myocardium

^url \'p 3 r(-3)l\ n [ME] (14c) 1 : gold or silver thread or wire for em- 
broidenng or edging 2 : the intertwisting of thread that knots a stitch 
usu. along an edge 3 : purl stitch 

purl v/ (1526) 1 a : to embroider with gold or silver thread b : to edge 
or border with gold or silver embroidery 2 : to knit in purl stitch ~ W 
: to do knitting m purl stitch 
^purl n [perh. of Scand origin; akin to Norw purla to ripplel (ca. 1522) 
1 : a purling or swirling stream or rill 2 : a gentle murmur or move- 
ment (as of purling water) 

4 ?!! rl .K 1 15 ? J) 1 /- EDDY ' SWIRL 2 : to make a s <>n murmuring sound 
like that of a purling stream 

pur-lieu VparK^yfi, 'par-(,)lu\ n [ME purlewe land severed from an En- 
glish royal forest by perambulation, fr. AF purali perambulation, fr. 

^ to go through, fr. pur- for, through + aler to go — more at 
rURCHASEi (15c) 1 a : an outlying or adjacent district b pi : ENVI- 
RONS, neighborhood 2 a : a frequently visited place : HAUNT bpl 
: CONFINES, BOUNDS 

pur-lin Vpar-lanX* [origin unknown] (15c) : a horizontal member in a 
roof 

pur-loin \(,)par-;ioin, 'por-A vt [ME, to put away, misappropriate, fr. AF 
purloigner, fr. OF porloigner to put off, delay, fr. por- forward + /oi>tgat 
a distance, fr. L longe, fr. longus long — more at purchasb, lono] 
U^cj : to appropriate wrongfully and often by a breach of trust syn 
see steal — pur.loin.er n 

purl stitch n [¹purl] (1885) : a knitting stitch usu. made with the yarn at
the front of the work by inserting the right needle into the front of a
loop on the left needle from the right, catching the yarn with the right
needle, and bringing it through to form a new loop — compare KNIT
STITCH

pu·ro·my·cin \,pyur-ə-'mī-sᵊn\ n [purine + -o- + -mycin] (1953) : an anti-
biotic C22H29N7O5 that is obtained from an actinomycete (Streptomy-
ces alboniger) and is used esp. as a potent inhibitor of protein synthesis
¹pur·ple \'pər-pəl\ adj pur·pler \-p(ə-)lər\; pur·plest \-p(ə-)ləst\ [ME
purpel, alter. of purper, fr. OE purpuran of purple, gen. of purpure pur-
ple color, fr. L purpura, fr. Gk porphyra] (bef. 12c) 1 : regal, IMPERI-
AL 2 : of the color purple 3 a : highly rhetorical : ORNATE b
: marked by profanity
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25 docket number 2300-0063.59 (U.S. S.N. 07/325,338, filed 17 
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corporated herein by reference. 

Technical Field

The invention relates to materials and
methodologies for managing the spread of non-A, non-B
hepatitis virus (NANBV) infection. More specifically, it
relates to diagnostic DNA fragments, diagnostic proteins,
diagnostic antibodies and protective antigens and antibod-



(U. S.S.N. 07/341,334 filed 20 April 1989), which are in- 
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Fig. 52 shows the nucleotide sequence of HCV 
cDNA in clone CA156e, the amino acids encoded therein, and 
the sequences which overlap with CA84a. 

Fig. 53 shows the nucleotide sequence of HCV
cDNA in clone CA167b, the amino acids encoded therein, and
the sequences which overlap CA156e.

Fig. 54 shows the ORF of HCV cDNA derived from
clones pi14a, CA167b, CA156e, CA84a, CA59a, K9-1, 12f,
14i, 11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b,
25c, 14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, and 15e.

Fig. 55 shows the hydrophobicity profiles of
polyproteins encoded in HCV and in West Nile virus.

Fig. 56 shows the nucleotide sequence of HCV
cDNA in clone CA216a, the amino acids encoded therein, and
the overlap with clone CA167b.

Fig. 57 shows the nucleotide sequence of HCV
cDNA in clone CA290a, the amino acids encoded therein, and
the overlap with clone CA216a.

Fig. 58 shows the nucleotide sequence of HCV
cDNA in clone ag30a and the overlap with clone CA290a.

Fig. 59 shows the nucleotide sequence of HCV
cDNA in clone CA205a, and the overlap with the HCV cDNA
sequence in clone CA290a.

Fig. 60 shows the nucleotide sequence of HCV
cDNA in clone 18g, and the overlap with the HCV cDNA
sequence in clone ag30a.

Fig. 61 shows the nucleotide sequence of HCV
cDNA in clone 16jh, the amino acids encoded therein, and
the overlap of nucleotides with the HCV cDNA sequence in
clone 15e.

Fig. 62 shows the composite sequence of the HCV
cDNA sense strand deduced from overlapping clones b114a,
18g, ag30a, CA205a, CA290a, CA216a, pi14a, CA167b, CA156e,
CA84a, CA59a, K9-1 (also called k9-1), 26j, 13i, 12f, 14i,
11b, 7f, 7e, 8h, 33c, 40b, 37b, 35, 36, 81, 32, 33b, 25c,
14c, 8f, 33f, 33g, 39c, 35f, 19g, 26g, 15e, b5a, and 16jh.
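A composite sequence of this kind is obtained by joining clone sequences wherever the end of one matches the start of the next. The following Python sketch illustrates that idea only. It is not the procedure used by the applicants, the names `overlap_length` and `merge` and the `min_overlap` parameter are hypothetical, and the two short fragments are made-up test strings rather than actual clone inserts. Real assembly would also need to tolerate sequencing differences between clones, which this exact-match sketch does not.

```python
def overlap_length(left: str, right: str, min_overlap: int = 8) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_overlap` bases exists.
    """
    for n in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def merge(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two fragments, writing the shared overlapping bases only once."""
    n = overlap_length(left, right, min_overlap)
    if n == 0:
        raise ValueError("fragments do not overlap by the required length")
    return left + right[n:]


# Two illustrative fragments that share a 9-base overlap (CACGTGGGG).
fragment_a = "ACCAAGCTCATCACGTGGGG"
fragment_b = "CACGTGGGGGGCAGATACC"

print(overlap_length(fragment_a, fragment_b))
print(merge(fragment_a, fragment_b))
```

Requiring a minimum overlap length guards against joining fragments on a short chance match; the value 8 here is an arbitrary choice for the example.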

individual NS proteins in the putative Flavivirus precursor
polyprotein are fairly well-known. Moreover, these also
coincide with observed gross fluctuations in the
hydrophobicity profile of the polyprotein. It is
established that NS5 of Flaviviruses encodes the virion
polymerase, and that NS1 corresponds with a complement
fixation antigen which has been shown to be an effective
vaccine in animals. Recently, it has been shown that a
flaviviral protease function resides in NS3. Due to the
observed similarities between HCV and the Flaviviruses,
deductions concerning the approximate locations of the
corresponding protein domains and functions in the HCV
polyprotein are possible (See Section IV.H.6). The
expression of polypeptides containing these domains in a
variety of recombinant host cells, including, for example,
bacteria, yeast, insect, and vertebrate cells, should give
rise to important immunological reagents which can be used
for diagnosis, detection, and vaccines.
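A hydrophobicity profile of the kind referred to above is commonly computed as a sliding-window average of per-residue hydropathy values. The following Python sketch shows one way to do this. The specification does not say which scale or window length was used, so the Kyte-Doolittle scale and the 7-residue default window are assumptions made for illustration, and the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Using this particular scale is an assumption; the source names no scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i .. i + window - 1 (0-based).
    Returns an empty list when the sequence is shorter than the window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Toy sequence: a hydrophobic stretch followed by a basic, hydrophilic one.
profile = hydropathy_profile("IIVLLAARRKKDE", window=5)
print([round(value, 2) for value in profile])
```

Stretches where the averaged value stays high suggest hydrophobic segments and stretches where it stays low suggest hydrophilic ones, which is the sort of gross fluctuation the passage compares between HCV and the Flaviviruses.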

Although the non-structural protein region of
the putative polyproteins of the HCV isolate described
herein and of Flaviviruses appears to be generally
similar, there is less similarity between the putative
structural regions which are towards the N-terminus. In
this region, there is a greater divergence in sequence,
and in addition, the hydrophobic profiles of the two
regions show less similarity. This "divergence" begins
in the N-terminal region of the putative NS1 domain in
HCV, and extends to the presumed N-terminus.
Nevertheless, it is still possible to predict the
approximate locations of the putative nucleocapsid
(N-terminal basic domain) and E (generally hydrophobic)
domains within the HCV polyprotein. In Section IV.H.6,
the predictions are based on the changes observed in the
hydrophobic profile of the HCV polyprotein, and on a
knowledge of the location and character of the flaviviral
proteins. From these predictions it may be possible to
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upon the Flavivirus model and the hydropathic character of 
the putative encoded polypeptides. However, the
hydrophobicity profiles (described infra) indicate that
HCV diverges from the Flavivirus model, particularly with
respect to the region upstream of NS2. Moreover, the
boundaries indicated are not intended to show firm
demarcations between the putative polypeptides.

The possible protein domains of the encoded HCV
polyprotein, as well as the approximate boundaries, are
the following:

Putative Domain                        Approximate Boundary
                                       (amino acid nos.)

C (nucleocapsid protein)               1-120
E (virion envelope protein(s)          120-400
  and possibly matrix (M) proteins)
NS1 (complement fixation antigen?)     400-660
NS2 (unknown function)                 660-1050
NS3 (protease?)                        1050-1640
NS4 (unknown function)                 1640-2000
NS5 (polymerase)                       2000-? end
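For working with the table, the approximate boundaries can be held in a small lookup structure. The following Python sketch is offered only as an illustration. The name `putative_domains` is hypothetical, the open end of NS5 is represented by `None` because the table gives no end position, and a residue that falls exactly on a shared boundary is reported under both neighbouring domains, since the boundaries are stated to be approximate rather than firm demarcations.

```python
from typing import Optional

# (domain, first residue, last residue) as listed in the table above.
# None marks the unspecified end of NS5 ("2000-? end").
DOMAINS: list[tuple[str, int, Optional[int]]] = [
    ("C", 1, 120),
    ("E", 120, 400),
    ("NS1", 400, 660),
    ("NS2", 660, 1050),
    ("NS3", 1050, 1640),
    ("NS4", 1640, 2000),
    ("NS5", 2000, None),
]


def putative_domains(residue: int) -> list[str]:
    """Return every putative domain whose approximate range includes `residue`."""
    if residue < 1:
        raise ValueError("amino acid numbering starts at 1")
    return [
        name
        for name, first, last in DOMAINS
        if first <= residue and (last is None or residue <= last)
    ]


print(putative_domains(60))    # inside the putative nucleocapsid region
print(putative_domains(120))   # on the shared C/E boundary
print(putative_domains(1100))  # inside the putative NS3 region
```

Returning a list rather than a single name keeps the overlap at the shared boundary values visible instead of silently assigning those residues to one side.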



The expression vectors containing the cloned HCV
cDNAs were constructed from pSODcf1, which is described in
Section IV.B.1. In order to be certain that a correct
reading frame would be achieved, three separate expression
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SEQUENCE OP THE HCV cDHA &*N5g A rftflN»> 



(AS DEDUCED FROM OVERLAPPING cz/hirc mu./i a ~j~ ~.Z 




ACCCCAACTO^CTCTCCATTCCCCCCCT^ 

CpACTCATGTGTGCTGTACACCCGACTCTGGTATTTGACATCACCAAATOCTGCT^ 
CCTCTTCGCACCCCTTTCCMTOTTCAM 3 °°° 

CGTCCAAGGCCrrCTCCGGrrCTGCGCGTTAGCGCGGAACA'PCATCnRafifV'r imifCT 



1300 



FIGURE 62 
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CTraCACCAGTAGTGCCCCACAGCTTCCJUTCrc 

CAACCCCTCTGTTCCTGCAACACTGCCCTTTTCTC^ 
C ^TCCTAACATCAG<3AC(X3G0GTGA<1AACAATTAC^CTOT 

AGACTGCGGG<*CGAGACTCG TTCTGCTCGCCACCCC^CC^CC^CTC 
CCTCACTCT(XX:CCATCCCAACATCGAOGA<WTTG<rrCTO 

T^TTCAAACAMAAGTCCCACGAACTCCCCGCAAAOCTGGTCQCMTG^^TCAATec 
£SZ^^ A £GCGGTCTTGACGTC^ 

caxtmstqtqtcacocaoxcaqxoqatttcaqccttgaccct^ctt^cSttcacac 

AATCACGCTaXCCAGGATGCTGTCTCCCGCACT^ son FIGURE 62.1 

^'^AATTTTCCOAOOGCQTCTrrACAGOCCTCACTCATAIAGATGCCCACTTTOTAT?' 
CCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTAC^ 5100 

CAA^CCACCXnCCATGGGCCAAC^CCCTGCXATACA^ 
AAT^CTGACCCACC<MTCAO^ 

CCTCTCAACAGGCTGCGTMTCATAGTCXWy^ 

^AC^^AAGTCCTCTACC(^AGT^^ ° 

CraCCGT^TaMCAAGGaTOAKEICa^^ 

CroCTCCAGACCCCGlCCCGTCIKHJCAGAGGm 

GTGOTTCGCTCCCCAG^^ 

AGCTGGOOCCGCCATCGGCAQTOTTGQACTQOQGAACGTCCTCATAGACATCCra 
GTATGGCGCGGGCGTGGCGGGA£2CTCTTGTGGCATTC 

CTCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGC 6000 
CC^GTGGTCTGTGCAGCAATACTC 6000 
GKCAIX^CCCMCTOATAC^^ 

CGTGCCGOAGAdCGATGCAGCTCCCOGCGTCACTGCCATACTCAG 
CCAGCTCCTCAGGCGACTGCACCAGTCGATAACCTOGGAGTGTACCACT^ 

GCTAAAAGCTAAGCTCMCCCACAGCOTO^ 
GATCACTGGACATGTCAAAAACGGGACGATCAGCATCGTCK 

CATGTGGAGTGOGACCTTCCCCATTAATGGCTACACCACG<X]CCCCTGTACCCCCCTTCC 
TGCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCACAGCAATATGTGGA^TAAG" 6600 
GCAGGTCGGC^CTTCCACTACGTGACGGGTATCACTACTGACAATCTCAAATGCOCGTG 
CCAGG TCCCATCGCCOGAATTTTTCACAGAATTGCACGGGGTGCGCCTACATAGGTMCC 
GCCCCCCTCCAXGCCCTTCCTGCGGCX^AGGTATCATTCAGAGTA^ACTCCACGAA^ 
CCCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCAT 
CCTCACTC ATCCCTCCCATATAACACCAG AGGCGGCCGCGCCAAOTTTGGCG AGGGGATC- 6900 ' 

ACCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAAC 
TTGCACCGCTAACCATGACTCCCCTCATGCTGAGCTCATAGACGCCAACCTCC 

GCAGGAGATGGGCGGCAACATCACCAGGGTTGAGTC^GAAAACAAAGtGGXGATTCrGGA 
CTCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGACATCTCCCTACCCGCAGAAAT 

CCTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAA-7200 
CCCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTG 
As a below named inventor, I hereby declare that:

My residence, post office address and citizenship are as stated below next to my name;
I BELIEVE I AM THE ORIGINAL, FIRST AND SOLE INVENTOR (if only one name is listed below) OR AN
ORIGINAL, FIRST AND JOINT INVENTOR (if more than one name is listed below) OF THE SUBJECT
MATTER WHICH IS CLAIMED AND FOR WHICH A PATENT IS SOUGHT ON THE INVENTION



ENTITLED: 



NANBV DIAGNOSTICS AND VACCINES 



the specification of which (check one):

      is attached hereto;
      was filed on 5/18/89 as Application Serial No. -^S nno
      and was amended on                (if applicable).



I HAVE REVIEWED AND UNDERSTAND THE CONTENTS OF THE ABOVE-IDENTIFIED SPECIFICATION,
INCLUDING THE CLAIMS, AS AMENDED BY ANY AMENDMENT REFERRED TO ABOVE;

I ACKNOWLEDGE THE DUTY TO DISCLOSE INFORMATION WHICH IS MATERIAL TO THE EXAMINATION
OF THIS APPLICATION IN ACCORDANCE WITH TITLE 37, CODE OF FEDERAL REGULATIONS,
Sec. 1.56(a) which states: "A duty of candor and good faith toward the Patent and Trademark
Office rests on the inventor, on each attorney or agent who prepares or prosecutes the application and on
every other individual who is substantively involved in the preparation or prosecution of the application
and who is associated with the inventor, with the assignee or with anyone to whom there is an obligation
to assign the application. All such individuals have a duty to disclose to the Office information they are
aware of which is material to the examination of the application. Such information is material where there
is a substantial likelihood that a reasonable examiner would consider it important in deciding whether to
allow the application to issue as a patent. The duty is commensurate with the degree of involvement in
the preparation or prosecution of the application";

I hereby claim the benefit under Title 35, United States Code, §120 of any United States application(s)
listed below, and, insofar as the subject matter of each of the claims of this application is not disclosed in
the prior United States application in the manner provided by the first paragraph of Title 35, United States
Code §112, I acknowledge the duty to disclose material information as defined in Title 37, Code of
Federal Regulations, §1.56(a) set forth above which occurred between the filing date of the prior applica-
tion and the national or PCT international filing date of this application:
As to the subject matter of this application which is common to said earlier application, I do not know and
do not believe that the same was ever known or used in the United States of America before my or our
invention thereof or patented or described in any printed publication in any country before my or our
invention thereof or more than one year prior to said earlier application, or in public use or on sale in the
United States of America more than one year prior to said earlier application; that said common subject
matter has not been patented or made the subject of an inventor's certificate issued before the date of
said earlier application in any country foreign to the United States of America on an application filed by me
or my legal representatives or assigns more than twelve months prior to said earlier application; and that
the earliest application(s) for patent or inventor's certificate on said invention filed by me or my legal
representatives or assigns in any country foreign to the United States of America is identified below, as
well as all other such applications (if any) filed more than twelve months prior to the filing date of this
application:



NA 



The priority of the earliest application(s) (if any) filed within a year prior to said pending prior application is
hereby claimed under 35 U.S.C. §119;



As to the subject matter of this application which is not common to said earlier application, I do not know
and do not believe that the same was ever known or used in the United States of America before my or
our invention thereof or patented or described in any printed publication in any country before my or our
invention thereof or more than one year prior to the date of this application, or in public use or on sale in
the United States of America more than one year prior to the date of this application, and that said subject
matter has not been patented or made the subject of an inventor's certificate issued in any country foreign
to the United States of America on an application filed by me or my legal representatives or assigns more
than twelve months prior to the date of this application, and that the earliest application(s) for patent or
inventor's certificate on said subject matter filed by me or my legal representatives or assigns in any
country foreign to the United States of America is identified below, as well as all other such application(s)
(if any) filed more than twelve months prior to the filing date of this application:
COMBINED DECLARATION AND POWER OF ATTORNEY
FOR CONTINUATION-IN-PART APPLICATION

Attorney Docket No. 2300-0063.28

The priority of the earliest application(s) (if any) filed within a year prior to this application is hereby
claimed under 35 U.S.C. §119;



I hereby appoint the following attorneys and agent(s) to prosecute said application and to transact all business in the Patent
and Trademark Office connected therewith and to file, prosecute and to transact all business in connection with international
applications directed to said invention:

William H. Benz - Reg. No. 25,952
Mary-Elizabeth Buckles - Reg. No. 31,907
Thomas E. Ciotti - Reg. No. 21,013
Ronald Craig Fish - Reg. No. 28,843
Grant D. Green - Reg. No. 31,259
Ronald S. Laurie - Reg. No. 25,431
Gladys H. Monroy - Reg. No. 32,430
Kate H. Murashige - Reg. No. 29,959
Lisabeth Feix Murphy - Reg. No. 31,347
Matthew C. Rainey - Reg. No. 32,291
Dianne E. Reed - Reg. No. 31,292
Roberta Robins - Reg. No. 33,208
Debra Shetka - Reg. No. 33,309



and: 



Robert P. Blackburn, Reg. No. 30,447 



Address all correspondence to: 
545 Middlefield Road, Suite 200
Menlo Park, California 94025-3471

Address all telephone calls to: Gladys H. Monroy at 415-327-7250.

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and
belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and
the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and
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NEW HCV ISOLATE 



Technical Field 

The present invention relates to new isolates of the viral
class Hepatitis C, polypeptides, polynucleotides and antibodies
derived therefrom, as well as the use of such polypeptides,
polynucleotides and antibodies in assays (e.g., immunoassays,
nucleic acid hybridization assays, etc.) and in the production of
viral polypeptides.

Background 

Non-A, Non-B hepatitis (NANBH) is a transmissible disease or
family of diseases that are believed to be viral-induced, and
that are distinguishable from other forms of viral-associated
liver diseases, including that caused by the known hepatitis
viruses, i.e., hepatitis A virus (HAV), hepatitis B virus (HBV),
and delta hepatitis virus (HDV), as well as the hepatitis induced
by cytomegalovirus (CMV) or Epstein-Barr virus (EBV). NANBH was
first identified in transfused individuals. Transmission from
man to chimpanzee and serial passage in chimpanzees provided
evidence that NANBH is due to a transmissible infectious agent or
agents. Epidemiologic evidence suggests that there may be
three types of NANBH: the water-borne epidemic type; the blood or
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NANBV and/or BB-NANBV from the class of the prototype isolate, 
HCV1, described by Houghton et al. See, e.g., EPO Pub. No. 
318,216 and U.S. Patent App. Serial No. 355,002, filed 19 May 
1989 (available in non-U.S. applications claiming priority
therefrom), the disclosures of which are incorporated herein by 
reference. The nucleotide sequence and putative amino acid 
sequence of HCV1 is shown in Figure 6. The terms HCV, NANBV, and 
BB-NANBV are used interchangeably herein. As an extension of 
this terminology, the disease caused by HCV, formerly called NANB 
hepatitis (NANBH), is called hepatitis C. The terms NANBH and 
hepatitis C may be used interchangeably herein. The term "HCV", 
as used herein, denotes a viral species of which pathogenic 
strains cause NANBH, as well as attenuated strains or defective 
interfering particles derived therefrom. 

HCV is a Flavi-like virus. The morphology and composition 
of Flavivirus particles are known, and are discussed by Brinton 
(1986) THE VIRUSES: THE TOGAVIRIDAE AND FLAVIVIRIDAE (Series eds. 
Fraenkel-Conrat and Wagner, vol. eds. Schlesinger and Schlesinger, 
Plenum Press), p. 327-374. Generally, with respect to morphology, 
Flaviviruses contain a central nucleocapsid surrounded by a lipid 
bilayer. Virions are spherical and have a diameter of about 
40-50 nm. Their cores are about 25-30 nm in diameter. Along the 
outer surface of the virion envelope are projections that are 
about 5-10 nm long with terminal knobs about 2 nm in diameter. 

The HCV genome is comprised of RNA. It is known that RNA 
containing viruses have relatively high rates of spontaneous 
mutation, i.e., reportedly on the order of 10^-3 to 10^-4 per 
incorporated nucleotide. Therefore, there are multiple strains, 
which may be virulent or avirulent, within the HCV class or 
species. 
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We examined the presence of hepatitis C virus (HCV) infection in liver 
tissues of hepatocellular carcinoma (HCC) patients who had antibodies to HCV 
and no markers of hepatitis B virus infection by the sensitive reverse 
transcription/polymerase chain reaction (R/PCR) method. The primers used were 
derived from the non-structural (NS) 3 and/or the structural (C/E) region. 
Amplified cDNA sequences of HCV were detected in either cancerous or non- 
cancerous portions of liver tissues from four out of eight HCC patients with 
primers of the NS3 region. Similar but less efficient results were obtained with 

[illegible] results indicate HCV [illegible] in the liver tissue 

HCC [illegible] persistent infection of HCV for the development of 
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The cDNA of the HCV genome was recently cloned and identified (1). An 
assay system for circulating antibodies to HCV was developed by use of an HCV 
antigen synthesized in recombinant yeast (2). The results obtained from this 
assay have shown that HCV is the major causative agent of transfusion- 

antibodies among HCC patients who had no serological markers for hepatitis B 
virus infection (non-B). A close association between HCV infection and 
development of HCC has been suggested (5,6). Furthermore, HCC developed in a 

[illegible] containing non-A, non-B hepatitis 

agent (7). To study the mechanism of development of HCC, it is necessary to 
examine the liver tissues of HCC patients for HCV. Recently, cDNA fragments of 
the HCV genome were cloned from Japanese HCV carriers (8,9). A comparison of 
nucleotide sequences of Japanese and USA isolates has revealed that there are 
some heterogeneities of the viral genome between the two isolates. With the 
nucleotide sequence available, it is now possible to detect the genome of HCV in 
blood or liver tissues of infected patients (10). 

Liver tissues were obtained from autopsy samples of eight non-B HCC 
patients who had antibodies to HCV. Cancerous and non-cancerous portions were 
taken from each patient and confirmed histologically. RNA was extracted by the 
guanidine thiocyanate/cesium fluorotriacetate method (11) from each tissue 
sample. About 4 µg of RNA was used for cDNA synthesis with 10 units of 
reverse transcriptase (Bio-Rad, Richmond, CA) and with an antisense primer 
[illegible] (5'-[illegible]AGACACTTCCACAT-3'). The cDNA was amplified by 
PCR (12) after addition of sense primer, J469S (5'- 
GTCACTCAGACGGTCGATTT-3'), encompassing the 440 base pairs (bp) of the 
non-structural protein region 3 (NS3) as previously described (8,9). Another set of 
primers, J135A (5'-ACAGCTTGTGGGATCCGGAG-3') and [illegible] 

the 440 bp of the structural 
protein region [C/E: putative core and envelope regions (18), K. Takeuchi et al., 
manuscript submitted] was also used. Thirty cycles of PCR were carried out as 
follows: denaturation for 1.5 min at 95°C, annealing of primers for 1.5 min at 55°C, 
and extension for 2 min at 70°C. The amplified cDNA fragment was 
electrophoresed on a 2% agarose gel and then transferred to a nylon membrane by 
a vacuum blotting system (LKB, Bromma, Sweden). A specific signal was 
identified by Southern blot hybridization under stringent conditions with a 
32P-labeled probe encompassing target regions of NS3 or C/E. The probe in the 
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Fig. 1. Detection of cDNA sequences of the HCV genome from liver 
tissues of HCCs. RNAs were prepared from cancerous (T) and non- 
cancerous (N) portions of autopsy samples of HCC patients who had 
antibodies to HCV. Products of R/PCR using either NS3 or C/E primers 
were electrophoresed on a 2% agarose gel, and analyzed by Southern 
blot hybridization with a probe encompassing either NS3 (A) or C/E 
region (B). HCC patients were indicated by numbers 1-8. The arrow 
indicates the position of the amplified 440 bp cDNA fragment of NS3 or 
C/E region. 
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Hepatitis C virus shares amino acid sequence similarity with 
pestiviruses and flaviviruses as well as members of two 
plant virus supergroups 

(non-A, non-B hepatitis/potyvirus/carmovirus/picornavirus/alphavirus) 

Roger H. Miller and Robert H. Purcell 
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Contributed by Robert H. Purcell, December 27, 1989 

ABSTRACT Hepatitis C virus (HCV) is an important 
human pathogen that is associated with transfusion-related 
non-A, non-B hepatitis. Recently, HCV cDNA was cloned and 
the nucleotide sequence of approximately three-quarters of the 
virus genome was determined. A region of the predicted 
polyprotein sequence was found to share similarity with a 
nonstructural protein encoded by dengue virus, a member of 
the flavivirus family. We report here that HCV shares an even 
greater degree of protein sequence similarity with members of 
the pestivirus group (i.e., bovine viral diarrhea virus and hog 
cholera virus), which are thought to be distantly related to the 
flaviviruses. In addition, we find that HCV shares significant 
protein sequence similarity with the polyproteins encoded by 
members of the picornavirus-like and alphavirus-like plant 
virus supergroups. These data suggest that HCV may be 
evolutionarily related to both plant and animal viruses. 



In recent years non-A, non-B (NANB) hepatitis has become 
the most common form of posttransfusion hepatitis (for 
reviews, see refs. 1-4). Although first discovered over a 
decade ago the etiological agent has remained elusive (5, 6). 
Studies involving the experimental inoculation of chimpan- 
zees provided evidence that the infectious agent was a 
lipid-containing virus 30-60 nm in diameter bearing strong 
resemblance to members of the Togaviridae family (7-11). 
Since titer of the virus in serum rarely reaches 10^6 chimpanzee 
infectious doses in patients, or experimentally infected 
animals, additional research has been difficult. 

Recently, a λgt11 library was constructed with cDNA 
synthesized from the RNA of the putative etiological agent of 
NANB hepatitis (12). Protein synthesized by a specific 
recombinant reacted exclusively with sera from NANB pa- 
tients (13). Molecular hybridization analysis demonstrated 
that the etiological agent, termed hepatitis C virus (HCV), is 
an RNA virus with a genome size of ~10 kilobases. The 
sequence of nearly three-quarters of the virus genome has 
been reported (14). Analysis indicates that the virus genome 
is of the plus, or message sense, polarity and appears to lack 
a poly(A) tail at its 3' end. The virus genome encodes a single 
polyprotein, a portion of which shares amino acid sequence 
similarity with the nonstructural number 3 (NS3) protein of 
dengue type 2 virus, a member of the flavivirus family. 
Additional computer-assisted protein analysis, presented 
here, demonstrates that HCV shares sequence similarity with 
the polyproteins of animal pestiviruses as well as those of the 
carmovirus and potyvirus families of plant viruses. 




MATERIALS AND METHODS 

Computer Analysis. Computer analysis was through the 
BIONET National Computer Resource for Molecular Biol- 
ogy. The program fasta (15) was used to search the Euro- 
pean Molecular Biology Organization (EMBO) and GenBank 
nucleotide data bases and the Swiss (SWS) and National 
Biomedical Research Foundation (NBRF) protein data bases 
for sequences with similarity to HCV sequences. fasta, a 
derivative of the fastp program that can be used for both 
nucleotide and amino acid data base searches, allows multiple 
regions of similarity between two sequences to be joined 
to determine a maximum alignment. Briefly, for a protein 
data base search, an initial similarity score is calculated based 
on a parameter that determines how many consecutive iden- 
tities are required in a match and on the total number of 
identical and similar amino acids as specified by the PAM-250 
matrix (16). Next, the fasta program determines whether 
several regions with high initial similarity values can be 
aligned. If so, the program produces an optimal similarity 
score. There are several limitations imposed when using this 
program on BIONET. One is that only data base files, and not 
individual user files, can be analyzed. The second limitation 
is that only one scoring matrix (i.e., the PAM-250 matrix) can 
be used for the analysis. Within the fasta program is a 
program rdf2 that evaluates the statistical significance of 
similarity scores by calculating a mean value and the standard 
deviation from the mean for the similarity scores of se- 
quences in the data base. In this study, a stringent cutoff 
value for significance of ≥20% amino acid identity in ≥100 
residues was also incorporated. Values cited in the text are 
given as optimized similarity scores with accompanying 
standard deviation units above the mean calculated for each 
data base search. 
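
The significance measure described above can be illustrated with a short sketch. It assumes, for illustration only, that "standard deviation units above the mean" is a plain z-score over the background similarity scores and that the stringent cutoff is applied as two simple thresholds. The function names and the use of the population standard deviation are assumptions; the actual rdf2 procedure is not reproduced here.

```python
from statistics import mean, pstdev


def sd_units_above_mean(score, background_scores):
    """Return how many standard deviations `score` lies above the mean
    of the background similarity scores (a plain z-score)."""
    mu = mean(background_scores)
    sigma = pstdev(background_scores)  # population SD: an assumption
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (score - mu) / sigma


def passes_stringent_cutoff(identical, aligned_length,
                            min_identity=0.20, min_length=100):
    """Apply the cutoff quoted in the text: at least 20% identical
    residues over an alignment of at least 100 residues."""
    if aligned_length < min_length:
        return False
    return identical / aligned_length >= min_identity
```

For example, an alignment with 67 identical residues over 331 positions passes the cutoff, whereas a 13-residue local match does not, however many of its residues are identical.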

Three programs were used to determine regions of amino 
acid similarity considering only identical matches in the 
scoring matrix (17-19). The program homology was used to 
search for local regions of identity. Residues occurring in the 
alignments are cited in the text along with the probability that 
the matches occurred due to chance (e.g., P = 0.05 signifies 
that there is a 5% chance that the same match could occur 
between random sequences of the same size). The program 
align was used to determine the similarity over longer 
protein domains that encompassed regions with statistically 
significant matches of identical amino acids. The calculated 
value Hmax is directly proportional to the degree of similarity 
between two sequences over a region of defined size. It 
should be noted that Hmax scores produced by the alignment 
of random sequences range from 20 to 25 for sequences of 190 
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amino acids using the default parameter settings of the 
program and a segment size of 195 amino acids. Finally, the 
program genalign was used for multiple sequence alignment. 



RESULTS 

Houghton et al (14) have reported the nucleotide sequence 
of approximately three-quarters of the HCV genome. The 
predicted polyprotein sequence, translated from the NS 
protein region of the HCV genome, is 2416 amino acids long. 
Analysis by Houghton and coworkers revealed that, among 
the virus sequences examined, the polyprotein sequence of 
HCV was most similar to that of a flavivirus. They reported 
a similarity between a 530-amino acid domain of the HCV 
polyprotein sequence and the NS3 protein sequence of den- 
gue virus. We were intrigued by the uniqueness of the HCV 
sequence and performed searches using several programs to 
identify global or local regions of significant similarity be- 
tween HCV and other sequences. This was of special interest 
since the nucleotide sequences of two pestivirus genomes, 
bovine viral diarrhea virus (20) and hog cholera virus (21), 
were determined recently. 

First, we used computer-assisted nucleotide sequence 
analysis to look for similarity between HCV and any se- 
quence recorded in the data base files. Computer searches 
conducted using the program fasta with the HCV RNA 
genome as the query sequence did not result in a statistically 
significant match with nucleotide sequences in the EMBO or 
GenBank data bases. These results are in agreement with 
those of Houghton and coworkers (14). Thus, we conclude 
that the genome of HCV is not closely related to that of any 
known RNA virus. 

Next, data base searches using the fasta program and the 
PAM-250 matrix of Dayhoff (16) were performed to detect 
protein sequences possessing significant global similarity to 
the HCV polyprotein. HCV query sequences used were the 
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complete 2416-amino acid polyprotein sequence, as well as 
the N terminus (i.e., residues 1-1299), and the C terminus 
(i.e., residues 1200-2416) of the reported HCV genome poly- 
protein. Searches were conducted using both the SWS and 
NBRF protein sequence data bases. The fasta search of the 
NBRF data base using the entire 2416-residue HCV sequence 
produced one statistically significant alignment. We found that 
the amino acid sequence of HCV shared 20.6% amino acid 
identity with the dengue type 2 virus (22) NS3 protein over a 
618-amino acid domain that encompassed the 530-amino acid 
region of similarity reported by Houghton et al (14). In 
addition to the 141 matches between identical amino acids, 
there were 262 amino acids matched by the PAM-250 matrix 
for a total similarity of 60%. The optimized similarity score of 
137 was 11.6 SD units away from the mean value of the 
analysis. The search of the SWS data base using the 2416- 
residue HCV polyprotein did not produce a statistically sig- 
nificant alignment. Therefore, using the 2416-amino acid se- 
quence as the query sequence only one alignment score was 
statistically significant in our analysis. 

The fasta search of both the NBRF and SWS data bases 
with the N terminus of the HCV polyprotein as the query 
sequence yielded an alignment that was identical to the one 
described above. The fasta search of the two data bases 
using the C terminus of the HCV polyprotein as the query 
sequence produced unexpected results. A statistically signif- 
icant alignment was identified between residues 2058 and 
2380 of the HCV polyprotein and the putative replicase of 
carnation mottle virus (CARMV), a member of the carmo- 
virus group of plant viruses (23). Over a domain of 331 amino 
acids 67 (20%) of the residues were identical and 126 (38%) 
were scored as similar by the PAM-250 matrix for a total 
similarity of 58% (Fig. 1). The optimized similarity score of 
the alignment was 140, which was 11 SD units above the mean 
score of the search. Overall, the HCV polyprotein was found 
to possess significant global similarity to only two sequences 
in the protein data bases. 
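
As a check on the arithmetic, the percentages quoted for the CARMV alignment follow directly from the residue counts. The sketch below is illustrative only; the function name and the rounding to whole percentages are assumptions rather than part of the fasta output.

```python
def alignment_summary(identical, similar, length):
    """Return identity, similarity and combined percentages,
    rounded to the nearest whole percent."""
    def pct(n):
        return round(100 * n / length)

    return {
        "identity": pct(identical),
        "similar": pct(similar),
        "total": pct(identical + similar),
    }


# Counts quoted in the text for HCV residues 2058-2380 vs. CARMV.
carmv = alignment_summary(identical=67, similar=126, length=331)
print(carmv)
```

With the counts given in the text (67 identical and 126 similar residues over 331 positions) this reproduces the reported 20%, 38%, and 58%.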
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Fig. 1. Alignment of the 
HCV polyprotein sequence (sin- 
gle-letter code) with the putative 
replicase of CARMV. Residues 
2058-2380 of the predicted ge- 
nome polyprotein of HCV (14) 
are aligned with residues 356- 
671 of CARMV (23) that are 
thought to represent the se- 
quences specifying the virus 
replicase. Identical amino acid 
matches are connected with a 
solid line, while matches scored 
as similar by the PAM-250 ma- 
trix are connected with a colon. 
Dashes represent spaces be- 
tween adjacent amino acids that 
have been inserted to optimize 
the alignment. Asterisks high- 
light the six amino acids that 
have been shown to be invariant 
among RNA virus replicases 
(24). 




Next, we used several programs to determine whether the 
HCV polyprotein shared local regions of similarity with other 
virus sequences scoring only identical amino acid matches. 
Analysis using the program homology revealed the pres- 
ence of statistically significant amino acid matches between 
HCV and two pestivirus polyprotein sequences. For exam- 
ple, the HCV sequences VVLATATPPGSVT (residues 874- 
886) and QRRGRTGRGKPGIYR (residues 1016-1030) were 
statistically similar to the bovine viral diarrhea virus (20) 
sequences VVAMTATPAGSVT (residues 2043-2055) and 
QRRGRVGRVKPGRYYR (residues 2199-2214) at the P = 
0.007 and 0.0005 levels, respectively. For reference pur- 
poses, we term the former HCV sequence region A and the 
latter HCV sequence region B. Similar findings were ob- 
tained when analyzing the hog cholera virus protein sequence 
(21). HCV regions A and B were also found to be similar to 
flavivirus and plant potyvirus polyprotein sequences; however, 
no such similarity was detected by comparing HCV to 
alphavirus, rubivirus, or picornavirus protein sequences. For 
example, the HCV sequence TATPPGS (residues 878-884) 
in region A was found to be identical to the dengue type 4 
virus (25) sequence TATPPGS (residues 1796-1802), which is 
a statistically significant match at the P = 0.044 level. This 
sequence alignment was also present in the global alignment 
of Houghton and coworkers (14) and in our alignment using 
the program fasta as described above. In addition, the HCV 
sequence LVVLATATPPG (residues 873-883) of region A 
was significantly similar to the tickborne encephalitis virus 
NS3 sequence (26) LVLMTATPPG (residues 1806-1815) at 
the P = 0.019 level of significance. Significant similarity was 
also found between HCV sequence region B and a plant 
potyvirus protein sequence. Specifically, the HCV sequence 
QRRGRTGRGKPG (residues 1016-1027) was similar to the 
sequence QRFGRVGRNKPG (residues 1463-1474) of the 
tobacco vein mottling virus (27) at the P = 0.018 level of 
significance. Overall, two regions of the "NS3-like" region of 
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Table 1. Hmax similarity values 
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The following virus sequences were used in the analysis: HCV 
(14); HOG, hog cholera virus (21); BVD, bovine viral diarrhea virus 
(23); TBE, tickborne encephalitis virus (25); JEV, Japanese encephalitis 
virus (28); YFV, yellow fever virus (29); DEN, dengue virus 
(25); WNF, West Nile fever virus (30); KUN, Kunjin virus (31); 
TVM, tobacco vein mottling virus (26). 

the HCV polyprotein were found to share sequence similarity 
with pestivirus, flavivirus, and potyvirus proteins. 
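
The local matches quoted above can be inspected by counting identical positions in each equal-length, ungapped peptide pair. This is only a minimal sketch with a hypothetical helper name. It does not reproduce the probability calculation of the homology program, and it leaves out the tickborne encephalitis pair because the two quoted sequences differ in length.

```python
def identical_positions(seq_a, seq_b):
    """Count positions holding the same residue in two ungapped
    peptides of equal length (single-letter code)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("peptides must have the same length")
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Peptide pairs as quoted in the text (HCV first, other virus second).
pairs = {
    "region A vs. bovine viral diarrhea virus": ("VVLATATPPGSVT", "VVAMTATPAGSVT"),
    "region A vs. dengue type 4 virus": ("TATPPGS", "TATPPGS"),
    "region B vs. tobacco vein mottling virus": ("QRRGRTGRGKPG", "QRFGRVGRNKPG"),
}

for label, (hcv, other) in pairs.items():
    n = identical_positions(hcv, other)
    print(f"{label}: {n} of {len(hcv)} residues identical")
```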

To determine the degree of relatedness among HCV and 
the proteins of the pesti-, flavi-, and potyviruses, we used 
several programs to analyze a 190-residue domain encom- 
passing HCV regions A and B. In the program align, the 
calculated value Hmax is directly proportional to the degree of 
similarity between two sequences over a region of defined 
size. The analysis indicated that the 190-amino acid region of 
HCV was most similar to that of bovine viral diarrhea virus 
(Hmax = 52), hog cholera virus (Hmax = 51), and tobacco vein 
mottling virus (Hmax = 47). Interestingly, HCV shared more 
similarity with the potyvirus sequence than it did with any of 
the flavivirus sequences (Hmax = 33-41) examined (Table 1). 
Multiple sequence alignment of these four sequences using 
the program genalign demonstrates that there are 25 amino 
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Fig. 2. Multiple sequence alignment of a conserved domain in the genome proteins (single-letter code) of HCV, pestiviruses, and a plant 
potyvirus. Alignment of the following regions of the genome polyproteins of four viruses is shown: HCV, residues [illegible]; BVD, 
residues 2025-2196 of bovine diarrhea virus (20); HOG, residues 1886-2057 of hog cholera virus (21); TVM, residues [illegible] of tobacco vein 
mottling virus (27). Identically matched amino acids between two or more virus sequences [illegible] 
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acids that are invariant among these diverse virus proteins 
(Fig. 2). Thus, it is likely that this region was conserved in 
evolution because the protein has an important biological 
function in virus replication or gene expression. 

DISCUSSION 

In this study, we used computer-assisted protein analysis to 
search for sequences with significant similarity to the HCV 
polyprotein. To identify sequences sharing global similarity, 
we used a data base searching program that incorporated the 
PAM-250 matrix to produce alignments consisting of identi- 
cal and similar amino acid matches. The analysis revealed 
that the HCV polyprotein possessed statistically significant 
similarity to only two sequences in the protein data bases. 
Both sequences were viral in origin. First, the NS3 protein of 
dengue type 2 virus aligned with a 618-residue domain located 
near the N terminus of the HCV polyprotein. This represents 
an extension of nearly 100 amino acids over an alignment 
reported by Houghton and coworkers (14) that spanned 530 
residues within the same region. Second, the putative repli- 
case of CARMV aligned with a region at the C terminus of the 
HCV polyprotein. This finding was unexpected since 
CARMV, a member of the carmovirus family, is a plant virus. 
Overall, the polyprotein of HCV was found to share global 
similarity with protein sequences encoded by RNA viruses of 
both animals and plants, which adds support to the hypoth- 
esis that there is an evolutionary relationship between these 
two virus groups. 

Analysis in which programs were used to search for regions 
of local identity of amino acids revealed that regions of the 
HCV polyprotein aligned with the NS3 protein sequence of 



flaviviruses and with corresponding regions of the polypro- 
teins of pestiviruses and plant potyviruses. The similarity was 
the greatest between HCV and pestiviruses. The reason that 
this similarity was not detected by others previously, or in 
our data base searches, was that the pestivirus sequences 
were published only recently and were not in the data bases 
for analysis. (Therefore, we analyzed the sequences from 
user files that we created.) Unexpectedly, we did not find 
significant similarity between the HCV genome protein se- 
quence and the putative replicase of the flaviviruses or 
pestiviruses. 

Comparative analysis of the polyproteins of the members 
of the flavivirus family reveals that the sequences of the NS 
proteins are highly conserved (Fig. 3). Multiple sequence 
alignment of the predicted polyprotein sequences of Japanese 
encephalitis (28), yellow fever (29), West Nile (30), Kunjin 
(31), tickborne encephalitis (26), and three dengue virus 
isolates (25, 32, 33) demonstrates that there are several 
regions of high amino acid conservation. Within the consen- 
sus polyprotein sequence of ~3400 amino acids there are 21 
domains that possess 5 or more consecutive amino acids that 
are identical in every flavivirus sequence (unpublished data). 
Eight of these domains are located in the NS3 protein 
sequence. The 190-amino acid domain of NS3 that shares 
sequence similarity with HCV contains 3 of these conserved 
domains. The first is a 7-residue sequence MTATPPG found 
at the N terminus of the domain. The second is a 5-residue 
sequence EMGAN near the C terminus. The third is an 
8-residue sequence SAAQRRGR located at the extreme C 
terminus of the domain. Regarding the latter sequence, 
although the next 3' residue is variable among flavivirus 
sequences the following 2 residues are always GR. Our 



[Fig. 3 histogram. Top: gene order C, M, E, NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5. Vertical axis ticks: 10-40. Horizontal axis: MAP POSITION, 1000-3500.] 


Fig. 3. Histogram of invariant amino acids in the genome polyprotein of the flaviviruses. The program genalign was used to align the amino 
acids of the following flaviviruses: three isolates of dengue virus (25, 32, 33), Kunjin virus (31), Japanese encephalitis virus (28), tickborne 
encephalitis virus (26), West Nile virus (30), and yellow fever virus (29). The number of identical amino acids at each position for all 8 sequences, 
within a block of 50 contiguous residues, is plotted against the position of the residues on the consensus genome polyprotein. The insertion of 
gaps to optimize the alignment resulted in a total length of the consensus sequence that was longer than any of the individual polyproteins. The 
gene order of the polyprotein is shown at the top illustrating the position of the structural proteins [i.e., the capsid (C), matrix (M), and envelope 
(E) proteins] and the NS proteins. The open box under the NS3 protein heading depicts the 190-amino acid domain that shares sequence similarity 
with regions A and B of the HCV polyprotein. The asterisk represents the position of the invariant GDD moiety of RNA virus replicases. 
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analysis indicates that only the first and third domains share 
significant similarity to HCV in the regions of the polyprotein 
sequence that we have termed A and B. 
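
The two conservation measures used in this analysis (invariant residues counted per block of 50 positions, as in Fig. 3, and domains of 5 or more consecutive invariant residues) can be sketched as follows. This is an illustrative sketch, not the genalign procedure: the function names are hypothetical, non-overlapping blocks are an assumption, gap columns are treated as not invariant, and the toy alignment at the end is invented for demonstration.

```python
def invariant_columns(aligned):
    """Flag alignment columns where every sequence carries the same
    residue. Columns containing a gap ('-') are not invariant."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    return [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]


def counts_per_block(flags, block=50):
    """Count invariant columns in consecutive, non-overlapping blocks."""
    return [sum(flags[i:i + block]) for i in range(0, len(flags), block)]


def conserved_runs(flags, min_run=5):
    """Return (start_index, run_length) for each run of at least
    `min_run` consecutive invariant columns."""
    runs, start = [], None
    for i, flag in enumerate(flags + [False]):  # sentinel closes last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    return runs


# Toy alignment (hypothetical sequences, for demonstration only).
toy = ["MTATPPGAK", "MTATPPGSK", "MTATPPG-K"]
flags = invariant_columns(toy)
print(counts_per_block(flags, block=5))
print(conserved_runs(flags))
```

On the toy alignment the first seven columns are invariant, so a single conserved run of length 7 is reported starting at index 0.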

The NS3 gene region of flaviviruses may encode a protein 
with several enzymatic activities. First, the N terminus of the 
NS3 protein is known to share sequence similarity with serine 
proteases (34). Second, the central domain of NS3 of both 
flaviviruses and plant potyviruses has been shown to share 
sequence similarity with helicase-like nucleoside triphos- 
phate binding (NTB) proteins from eukaryotic and prokary- 
otic cells (35). We find that HCV also shares similarity to 
NTB proteins in regions A and B of the polyprotein sequence 
(unpublished data). Thus, it is possible that flavi-, poty-, and 
pestiviruses, as well as HCV, encode an NTB protein that has 
been conserved in evolution because of its important catalytic 
function in virus gene expression or replication. 

The NS5 protein has the most highly conserved amino acid 
sequence of any of the flavivirus proteins and is thought to 
encode the virus replicase. Within NS5 there are 10 domains 
that contain ≥5 consecutive identical amino acids including 
the longest tract of invariant residues (i.e., 14 amino acids) 
identified in the alignment of the polyproteins. In addition, all 
flavivirus NS5 proteins possess the 6-amino acid residues 
that are known to be invariant among RNA polymerase 
sequences (24). Despite the fact that NS5 is more highly 
conserved than NS3, we found that there was no statistically 
significant similarity between the flavivirus NS5 protein and 
the HCV polyprotein using global or local alignment pro- 
grams. The only sequence that possessed statistically signif- 
icant similarity with a region at the C terminus of the HCV 
polyprotein sequence was the putative replicase of CARMV. 
Therefore, the HCV replicase may be most closely related to 
that of a plant virus. 

Overall, we find that HCV sequences share significant 
similarity with proteins from members of two unrelated plant 
virus families. RNA viruses of plants have been assigned to 
two supergroups based on the similarity of their genome and 
protein sequences to either the picorna- or the alphaviruses 
of animals. The picornavirus supergroup consists of the 
como-, nepo-, and potyviruses, while the alphavirus, or 
Sindbis-like, supergroup consists of the alfalfa mosaic, ilar-, 
bromo-, cucumo-, tobamo-, potex-, tobra-, furo-, hordei-, 
tombus-, and carmovirus groups (36). There is some specu- 
lation that the tombusviruses and carmoviruses may belong 
to a third supergroup because of their unusually small ge- 
nome size. The genome of the latter virus group is ~4000 
nucleotides and does not encode an NS3-like protein. Our 
analysis indicates that amino acid sequences near the N 
terminus of the HCV polyprotein are similar to those of the 
potyviruses, while amino acid sequences near the C terminus 
of the HCV polyprotein are most similar to those of the 
carmoviruses. Thus, it is possible that HCV represents a 
recombinant virus possessing an N terminus derived from a 
picornavirus-like ancestor and a C terminus derived from an 
alphavirus-like ancestor. However, it is clear that HCV is not 
closely related to any of these RNA virus families or any 
other RNA virus family thus far described. 

In conclusion, taxonomic classification of HCV must await 
analysis of the complete nucleotide sequence, which includes 
the genes encoding the structural proteins as well as the 5' 
and 3' noncoding regions. The data presented here suggest 
that HCV is distantly related to the pestiviruses and flavivi- 
ruses of animals and to members of two plant virus super- 
groups. It is possible that HCV is a recombinant virus since 
RNA recombination has been demonstrated for positive- 
strand (37) and negative-strand RNA viruses (38). Another 
possibility is that a single virus gave rise to HCV and these 
similar viruses. Thus, HCV may represent an evolutionary 
link between the plant virus supergroups and between viruses 
infecting both plants and animals. 
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Hepatitis C Virus NS3 Serine Proteinase: trans-Cleavage 

Requirements and Processing Kinetics 
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The hepatitis C virus H strain (HCV-H) polyprotein is cleaved to produce at least 10 distinct products, in 
the order of NH2-C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B-COOH. An HCV-encoded serine proteinase 
activity in NS3 is required for cleavage at four sites in the nonstructural region (3/4A, 4A/4B, 4B/5A, and 
5A/5B). In this report, the HCV-H serine proteinase domain (the N-terminal 181 residues of NS3) was tested 
for its ability to mediate trans-processing at these four sites. By using an NS3-5B substrate with an inactivated 
serine proteinase domain, trans-cleavage was observed at all sites except for the 3/4A site. Deletion of the 
inactive proteinase domain led to efficient trans-processing at the 3/4A site. Smaller NS4A-4B and NS5A-5B 
substrates were processed efficiently in trans; however, cleavage of an NS4B-5A substrate occurred only when 
the serine proteinase domain was coexpressed with NS4A. Only the N-terminal 35 amino acids of NS4A were 
required for this activity. Thus, while NS4A appears to be absolutely required for trans-cleavage at the 4B/5A 
site, it is not an essential cofactor for serine proteinase activity. To begin to examine the conservation (or 
divergence) of serine proteinase-substrate interactions during HCV evolution, we demonstrated that similar 
trans-processing occurred when the proteinase domains and substrates were derived from two different HCV 
subtypes. These results are encouraging for the development of broadly effective HCV serine proteinase 
inhibitors as antiviral agents. Finally, the kinetics of processing in the nonstructural region was examined by 
pulse-chase analysis. NS3-containing precursors were absent, indicating that the 2/3 and 3/4A cleavages occur 
rapidly. In contrast, processing of the NS4A-5B region appeared to involve multiple pathways, and significant 
quantities of various polyprotein intermediates were observed. NS5B, the putative RNA polymerase, was found 
to be significantly less stable than the other mature cleavage products. This instability appeared to be an 
inherent property of NS5B and did not depend on expression of other viral polypeptides, including the 
HCV-encoded proteinases. 



Hepatitis C viruses (HCVs) have recently been recognized 
as agents of the parenterally transmitted form of non-A, non-B 
hepatitis (17, 41). Virtual elimination of HCV-contaminated 
blood has greatly reduced the incidence of posttransfusion 
hepatitis; however, HCV remains responsible for a significant 
proportion of community-acquired hepatitis (1). In most cases, 
HCV is not cleared and establishes a chronic infection that can 
be associated with chronic hepatitis and more severe liver 
disease such as cirrhosis and hepatocellular carcinoma (63). 
For these reasons, there is considerable interest in developing 
additional HCV-specific antiviral agents that can complement 
currently available alpha interferon therapy, which effectively 
controls disease in only a minority of HCV-infected patients. 

At least 15 full-length HCV genome sequences, as well as 
partial sequences for many other isolates, have been reported 
(see reference 60 and citations therein). These data indicate 
the existence of multiple genotypes that can diverge by as much 
as 50% at the amino acid level (10, 64, 65). This group of 
related viruses is now classified as a separate genus in the 
family Flaviviridae (27), which includes two other genera, 
Flavivirus (12) and Pestivirus (20). The positive-strand HCV 
genome RNA is approximately 9.4 kb in length and contains a 
highly conserved 5' noncoding region followed by a long open 
reading frame encoding a polyprotein of 3,010 to 3,033 amino 
acids (36, 51). Because a cell culture system supporting efficient 
HCV replication is lacking, efforts to define potential 
HCV-encoded polypeptides have utilized expression of HCV 
cDNA in cell-free translation and in cell cultures. The HCV 
polyprotein appears to be cleaved at multiple sites to produce 
at least 10 structural and nonstructural (NS) proteins (47). The 
order and nomenclature of these cleavage products for the 
HCV H strain (HCV-H) are 
NH₂-C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B-COOH, where C, E1, and E2 are 
putative structural proteins and the remaining NS proteins are 
believed to be replicase components (30-32, 47). Host signal 
peptidase in the endoplasmic reticulum lumen appears to 
catalyze cleavages in the structural-NS2 region (C/E1, E1/E2, 
E2/p7, and p7/NS2 sites) (33, 47), whereas an HCV-encoded 
serine proteinase located in the N-terminal one-third of the 
NS3 protein is responsible for four cleavages in the NS region 
(3/4A, 4A/4B, 4B/5A, and 5A/5B sites) (5, 22, 30, 34, 50, 69). 
Autocatalytic cleavage at the 2/3 site is mediated by a second 
HCV-encoded proteinase that encompasses the NS2 region 
and the NS3 serine proteinase domain (31, 35). 
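
The assignment of cleavage sites to proteinases described above can be written down as a small lookup table. The following is only an illustrative sketch: the site labels and enzyme assignments are taken from the text, whereas the names `PRODUCTS`, `SITES`, and `sites_cleaved_by` are hypothetical and introduced here for convenience.

```python
# Order of the mature products along the HCV-H polyprotein, as given in the text.
PRODUCTS = ["C", "E1", "E2", "p7", "NS2", "NS3", "NS4A", "NS4B", "NS5A", "NS5B"]

# (site label, enzyme reported to cleave it), in N- to C-terminal order.
# Labels follow the text; the tuple layout is an assumption of this sketch.
SITES = [
    ("C/E1", "host signal peptidase"),
    ("E1/E2", "host signal peptidase"),
    ("E2/p7", "host signal peptidase"),
    ("p7/NS2", "host signal peptidase"),
    ("2/3", "NS2-3 proteinase"),
    ("3/4A", "NS3 serine proteinase"),
    ("4A/4B", "NS3 serine proteinase"),
    ("4B/5A", "NS3 serine proteinase"),
    ("5A/5B", "NS3 serine proteinase"),
]


def sites_cleaved_by(enzyme):
    """Return the site labels attributed to the given enzyme, in polyprotein order."""
    return [label for label, name in SITES if name == enzyme]


if __name__ == "__main__":
    # Ten products are separated by nine junctions.
    print(len(PRODUCTS), "products,", len(SITES), "cleavage sites")
    print(sites_cleaved_by("NS3 serine proteinase"))
```

Listing the sites in polyprotein order makes it easy to check that the ten products are separated by nine junctions, four of which depend on the NS3 serine proteinase.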

In this study, we tested the ability of the NS3 serine 
proteinase domain (called NS3₁₈₁) to mediate trans-processing 
at each of the four downstream sites. All four sites could be 
cleaved in trans; however, requirements for trans-cleavage 
varied for different sites. Cleavage in trans at the 3/4A site was 
very inefficient, if it occurred at all, when the substrate contained 
an inactivated serine proteinase. Coexpression of NS4A is 
required for cleavage at the 4B/5A site, but not at the 5A/5B 
site. We also tested the ability of the serine proteinases from 
two HCV subtypes (H and BK strains) to mediate 
trans-processing of heterologous HCV polyprotein substrates. 
Finally, we used a vaccinia virus recombinant expressing the 
entire HCV polyprotein to examine the processing kinetics in 
the NS region and the stability of HCV precursors and 
cleavage products. 

MATERIALS AND METHODS 

Cell cultures. The BHK-21 and CV-1 cell lines were ob- 
tained from the American Type Culture Collection, and the 
BSC-40 cell line (9) was obtained from D. Hruby (Oregon 
State University). Cell monolayers were grown in Eagle's 
minimal essential medium (MEM) supplemented with 2 mM 
L-glutamine, nonessential amino acids, penicillin, streptomycin, 
and 10% fetal bovine serum (FBS). The A16 subclone of 
the human hepatoma HepG2 cell line, generously provided by 
Alan Schwartz (Washington University), was maintained in 
Dulbecco's modified Eagle medium supplemented with peni- 
cillin, streptomycin, and 10% FBS. 

Plasmid constructions. Standard recombinant DNA tech- 
niques (61) were used for construction of the expression 
plasmids described below. For all plasmids, regions of HCV-H 
coding sequence amplified by PCR were verified by DNA 
sequence analysis. 

Synthetic oligonucleotides and PCR were used to engineer 
initiation or termination codons as well as convenient restriction 
sites for subcloning (5' NcoI and 3' XhoI sites) for several 
HCV-H expression constructs. These constructs (with the 
encoded polyproteins given in parentheses) are as follows (Fig. 
1): pTM3/HCV1027-1657 (NS3), pBRTM/HCV1027-1711 
(NS3-4A), pTM3/HCV1658-1711 (NS4A), pTM3/HCV1658-1972 
(NS4A-4B), pTM3/HCV1658-2420 (NS4A-5A), 
pTM3/HCV1712-2420 (NS4B-5A), pBRTM/HCV1712-3011 (NS4B-5B), 
pBRTM/HCV1973-3011 (NS5A-5B), and pTM3/HCV2421-3011 
(Met-NS5B). The sequences encompassing the engineered 
initiation codons (boldface) are as follows (HCV-H 
sequence underlined): NS3, 5'-CCATGGCGCCC-3'; NS4A, 
5'-CCATGGCCAGCACC-3'; NS4B, 5'-CCATGGCGTCTCAG-3'; 
NS5A, 5'-CCATGGGATCC£KiC-3'; and NS5B, 
5'-CCATGCK3CK^AICt-3'. For the engineered termination 
codons (boldface), the surrounding sequences are as follows 
(HCV-H sequence underlined): NS3, 5'-filEACGTCACTCGAG-3'; 
NS4A, 5'-QAQIQCTAGCTCGAG-3'; NS4B, 
5'-j^AXQCTAGCTCGAG-3'; and NS5A, 5'-TQCEQCTAGCTCGAG-3'. 

pTM3/Ubiquitin-HCV2421-3011 (Ubi-NS5B) was constructed 
by ligation of two PCR-derived fragments into 
pTM3/HCV2421-3011 (Met-NS5B). The initiating methionine of the 
ubiquitin monomer corresponds to the ATG in the NcoI site of 
pTM3. The ubiquitin (double underlined)-NS5B (underlined) 
junction was created by using a BamHI restriction site (boldface) 
as follows: CGC GGT GGA TCC ATG TCT. The 
template for PCR amplification of the ubiquitin cassette was 
pTM3/Ub-nsP4 (Tyr) (44). 

Additional HCV-H expression plasmids (with the encoded 
polyproteins given in parentheses) were constructed by subcloning 
appropriate fragments from previously described constructs 
(Fig. 1). pTM3/HCV1027-1207 (NS3₁₈₁) was derived 
from pTM3/HCV1027-1657 (described above) and 
pTM3/HCV827-1207 (31); pTM3/HCV1027-1676 (NS3-4A₁₉) was 
derived from pTM3/HCV1027-1657 and pBRTM/HCV1-1676 
(32); pTM3/HCV1027-1692 (NS3-4A₃₅) was derived from 
pTM3/HCV1027-1657 and pBRTM/HCV1-1692 (32); 
pBRTM/HCV1027-3011 S1165A (NS3-5B*) was derived from 
pBRTM/HCV[illegible]-3011 S1165A (30) and pBRTM/HCV1027-1711; 
pTM3/HCV1193-1657 (NS3₁₆₇₋₆₃₁) was derived from 
pBRTM/HCV1193-3011 (30) and pTM3/HCV1027-1657; and 
pTM3/HCV1193-1711 (NS3₁₆₇-4A) was derived from 
pBRTM/HCV1193-3011, pBRTM/HCV1027-1711, and pTM3. 
pTM3/HCV1658-1676 (NS4A₁₉) was constructed by deleting the 
HincII-NheI fragment of pTM3/HCV1658-1711 (the NheI site was 
filled in by using T4 DNA polymerase prior to ligation). 
pTM3/HCV1658-1692 (NS4A₃₅) was generated by deleting the 
NaeI-NheI fragment of pTM3/HCV1658-1711 (the NheI site 
was filled in by using T4 DNA polymerase prior to ligation). 
pTM3/HCV2269-2508 (NS5A₂₉₇-5B₈₈) was made by subcloning 
the 1,274-bp BsaI-BglII fragment from pTM3/HCV1-2508 
(32) into pTM3 digested with NcoI and BglII (the BsaI and 
NcoI sites were filled in by T4 DNA polymerase prior to 
ligation). pTM3/HCV2285-2508 (NS5A₃₁₃-5B₈₈) was constructed 
by subcloning the 1,227-bp ApaI-BglII fragment of 
pTM3/HCV1-2508 (32) into pTM3, which had been previously 
digested with NcoI and BglII (the ApaI and NcoI cleavage sites 
were trimmed and filled in, respectively, by T4 DNA polymerase 
prior to ligation). 

FIG. 1. HCV genome structure and expression constructs. (A) 
Diagram of the HCV-H strain polyprotein and its cleavage products 
shown as boxes. The identities of the mature proteins, including C, E1, 
E2, p7, NS2, NS3, NS4A, NS4B, NS5A, and NS5B, are indicated (32, 
47). The number at the top of each cleavage product indicates the 
position of its N-terminal residue in the polyprotein sequence. The 
apparent molecular masses for HCV proteins (p) and glycoproteins 
(gp) are indicated under each product (in kilodaltons). Regions 
containing predominantly uncharged amino acids are indicated as 
black bars. Also shown are putative cleavage sites for host signal 
peptidase (♦) (33, 47), the HCV NS2-3 proteinase (0) (31, 34), and the 
NS3 serine proteinase (4J-) (5, 22, 30, 34, 50, 69). (B) HCV-H 
polypeptide expression constructs used in this study. HCV polypeptide 
sequences present in each pBRTM/HCV or pTM3/HCV construct are 
indicated by black lines, which are drawn to scale and oriented with 
respect to the diagram of the HCV-H polyprotein. Numbers at the 
ends of each line refer to the first and last amino acids of the HCV 
polypeptide expressed by the particular construct. For simplicity, the 
NS prefix is not used for the nomenclature of each encoded polypeptide, 
which is indicated on the left. (C) HCV-BK polypeptide expression 
constructs. (See the legend to panel B for details.) 
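
The construct names carry the first and last polyprotein residues of the encoded polypeptide, so the subscripted names used in this report (for example NS3₁₈₁ or NS5A₂₉₇-5B₈₈) follow from simple arithmetic. The sketch below illustrates that bookkeeping; the start positions are read from the construct boundaries given above, and the helper names (`parse_construct`, `length`, `position_in`) are hypothetical.

```python
import re

# First polyprotein residue of each mature protein (HCV-H numbering), read from
# the construct boundaries given in the text, e.g. pTM3/HCV1027-1657 encodes NS3.
PROTEIN_START = {"NS3": 1027, "NS4A": 1658, "NS4B": 1712, "NS5A": 1973, "NS5B": 2421}


def parse_construct(name):
    """Return (first, last) polyprotein residues from a name such as 'pTM3/HCV1027-1207'."""
    match = re.search(r"HCV(?:-BK)?(\d+)-(\d+)", name)
    if match is None:
        raise ValueError(f"no residue range found in {name!r}")
    return int(match.group(1)), int(match.group(2))


def length(name):
    """Number of HCV residues encoded by the construct (both ends inclusive)."""
    first, last = parse_construct(name)
    return last - first + 1


def position_in(protein, polyprotein_residue):
    """Convert a polyprotein residue number to a position within one mature protein."""
    return polyprotein_residue - PROTEIN_START[protein] + 1


if __name__ == "__main__":
    print(length("pTM3/HCV1027-1207"))   # length of the NS3 proteinase domain construct
    print(position_in("NS3", 1193))      # NS3 position at which HCV1193-3011 begins
    print(position_in("NS5A", 2269), length("pTM3/HCV2421-2508"))
```

Under these assumptions, pTM3/HCV1027-1207 encodes 181 residues (NS3₁₈₁), residue 1193 is position 167 of NS3, and pTM3/HCV2269-2508 begins at position 297 of NS5A and ends 88 residues into NS5B.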

Expression constructs for the HCV BK strain (HCV-BK) 
were made with cDNA clones generously provided by H. 
Okayama and A. Takamizawa (67). pTM3/HCV-BK1027-1207 
[encoding polypeptide NS3₁₈₁(BK)] was constructed by subcloning 
a PCR fragment amplified from pUC19/BK-146 (67) 
into pTM3. The sequences encompassing the engineered initiation 
and termination codons (boldface) include (HCV-BK 
sequences underlined) an NcoI site at the N terminus 
(5'-CCATGGCTCCC-3') and a BamHI site at the C terminus 
(5'-CQGIXHTAATAGGATCC-3'). pBRTM/HCV-BK1221-3011 
was produced by subcloning appropriate fragments from 
four HCV-BK cDNA clones, including pUC19/BK-102, BK-112-1, 
BK-112-5, and BK-166. Because the HCV-BK coding 
sequence in pUC19/BK-102 clones was fused in frame to the 
AUG codon in the NcoI site of the adaptor sequence, 
pBRTM/HCV-BK1221-3011 encodes a polyprotein [NS3₁₉₅-5B(BK)] 
encompassing HCV-BK residues 1221 to 3011 after the initiating 
methionine. 

Generation and growth of vaccinia virus-HCV recombinants. 
vHCV1027-1207 was generated by marker rescue of 
pTM3/HCV1027-1207 (49). Recombinant viruses were plaque 
purified three times under gpt selection (25) prior to growth of 
large-scale stocks. A vaccinia virus-HCV recombinant encoding 
the entire HCV-H open reading frame, vHCV1-3011, has 
been described previously (47). Stocks of vHCV1027-1207, 
vHCV1-3011, and vTF7-3, a vaccinia virus recombinant expressing 
the T7 DNA-dependent RNA polymerase (28), were 
grown in BSC-40 monolayers and partially purified (37), and 
titers of infectious progeny were determined by plaque assay 
on BSC-40 cells (37). 

Transient expression with the vaccinia virus-T7 hybrid 
system. For expression assays utilizing vaccinia virus-HCV 
recombinants, monolayers of HepG2-A16 or BHK-21 cells in 
35-mm-diameter dishes were infected with vTF7-3 alone or in 
combination with vHCV1-3011, vHCV827-3011 (32), or 
vHCV1027-1207. The multiplicity of infection for each recombinant 
was 10 PFU per cell. After adsorption for 60 min at 
room temperature, the inoculum was removed and replaced 
with MEM containing 2% FBS. Expression assays of transfected 
plasmid constructs utilized subconfluent monolayers of 
BHK-21 cells that had been previously infected with vTF7-3 as 
described above. Some of them were also coinfected with 
vHCV1027-1207. After removal of the inoculum, cells were 
transfected for 2 h at 37°C with a mixture consisting of 1 µg of 
plasmid DNA and 10 µg of Lipofectin (Bethesda Research 
Laboratories) in 0.5 ml of MEM. If two constructs were used 
in a single transfection, the amount of each plasmid varied 
from 0.5 µg to 1 µg, with a total of 1.5 µg of DNA mixed with 
15 µg of Lipofectin. 
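
Both transfection mixtures described above use the same 10:1 mass ratio of Lipofectin to plasmid DNA. As a minimal sketch of that arithmetic (the constant and function names are hypothetical, and the fixed ratio is inferred only from the two mixtures stated in the text):

```python
# Mass ratio of Lipofectin to plasmid DNA, inferred from the two mixtures
# given in the text (1 µg DNA : 10 µg Lipofectin; 1.5 µg DNA : 15 µg Lipofectin).
LIPOFECTIN_UG_PER_UG_DNA = 10.0


def lipofectin_ug(plasmid_amounts_ug):
    """Return the Lipofectin mass (µg) for a list of plasmid DNA masses (µg)."""
    total_dna = sum(plasmid_amounts_ug)
    return total_dna * LIPOFECTIN_UG_PER_UG_DNA


if __name__ == "__main__":
    print(lipofectin_ug([1.0]))        # single-construct transfection
    print(lipofectin_ug([0.5, 1.0]))   # two constructs, 1.5 µg DNA in total
```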

For pulse-chase experiments, monolayers were washed once 
with prewarmed methionine-deficient MEM at 3 h postinfection 
and incubated in the same medium for 20 min at 37°C. 
Cells were labeled by incubation for 20 min at 37°C with 
methionine-deficient MEM supplemented with 100 µCi of 
³⁵S-protein labeling mixture (NEN) per ml. For chase experiments, 
the labeling mixture was replaced with MEM containing 
2% FBS, 1.5 mg of methionine per ml, and 100 µg of 
cycloheximide per ml and incubated for the indicated periods 
at 37°C. For steady-state labeling, cell monolayers were washed 
once at 3 h postinfection as described above and then were 
incubated for 4 h at 37°C with MEM containing 1/40th the 
normal concentration of methionine and cysteine, 2% FBS, 
and 40 µCi of ³⁵S-protein labeling mixture per ml. 



Cell lysis, immunoprecipitation, and protein analyses. After 
labeling, cell monolayers were washed with phosphate-buffered 
saline and lysed with a solution of 0.5% sodium dodecyl 
sulfate (SDS), 50 mM Tris-Cl (pH 7.4), 1 mM EDTA, and 20 
µg of phenylmethylsulfonyl fluoride per ml (0.3 ml/10⁶ cells). 
Cellular DNA was sheared by repeated passage through a 
27.5-gauge needle. Prior to immunoprecipitation (47), lysates 
were heated to 70°C for 10 min. Portions of each lysate were 
incubated either with 5 to 10 µl of the indicated rabbit 
polyclonal antisera or with 2 µl of serum JHF from an 
HCV-positive patient (32). Immune complexes were collected 
by using Staphylococcus aureus Cowan I (Calbiochem) as 
described previously (59), solubilized, and analyzed by 
SDS-polyacrylamide gel electrophoresis (PAGE) (42) or 
Tricine-SDS-PAGE (62). After treatment for fluorography with 
En³Hance (DuPont), gels were dried and exposed at -70°C 
with prefogged (43) X-ray film (Kodak). ¹⁴C-methylated 
molecular weight marker proteins were purchased from 
Amersham. 

Cell-free translation. The 5'-uncapped RNA transcripts 
were synthesized from linearized cDNA templates with T7 
DNA-dependent RNA polymerase (Epicenter) (58). Cell-free 
translation mixtures with rabbit reticulocyte lysates (Promega) 
and [³⁵S]methionine (Amersham) were incubated for 1 h at 
30°C essentially according to the manufacturer's instructions. 
The translation reactions were terminated by the addition of 
RNase A (Boehringer Mannheim) to 10 µg/ml, cycloheximide 
to 0.3 mg/ml, and cold methionine to 1 mM. A portion of the 
translation reaction mixtures was removed at the indicated 
time, diluted 10-fold with the Laemmli sample buffer, heated 
for 5 min at 95°C, and analyzed by SDS-PAGE as described 
above. 

RESULTS 

trans-Cleavage at all four serine proteinase-dependent sites. 
The serine proteinase domain of HCVs was initially identified 
on the basis of sequence homology to members of the trypsin 
superfamily (7, 29). The predicted domain is approximately 
180 residues and corresponds to the N-terminal one-third of 
NS3. This enzyme is required for processing in the NS3-4-5 
region of the HCV polyprotein, and alanine substitutions for 
predicted active site residues (His-1083 or Ser-1165 for 
HCV-H) abolish cleavage at the 3/4A, 4A/4B, 4B/5A, and 
5A/5B sites (5, 22, 30, 34, 50, 69). To purify and characterize 
this enzyme, we have used the vaccinia virus-T7 hybrid expression 
system to examine the ability of the predicted serine 
proteinase domain, expressed as an individual polypeptide 
(NS3₁₈₁), to mediate trans-cleavage at each of these four sites. 

The first substrate examined was an NS3-5B polyprotein 
containing the Ala substitution at Ser-1165 (NS3-5B*) (Fig. 1). 
This mutation completely inactivates the serine proteinase, 
and no processed products were observed (Fig. 2). When 
coexpressed with NS3₁₈₁, cleavage occurred at the 4A/4B, 
4B/5A, and 5A/5B sites, as evidenced by the appearance of 
NS4B (Fig. 2B), NS5A (Fig. 2C), and NS5B (Fig. 2D). In 
contrast, we observed a more-slowly-migrating NS3-specific 
product, presumably NS3-4A, in addition to a very faint band 
corresponding to NS3 (Fig. 2A). This suggests that very 
inefficient, if any, trans-cleavage occurred at the 3/4A site of 
this substrate. 

The lack of trans-cleavage at the 3/4A site has been observed 
in other studies and has led to the proposal that this site can 
only be cleaved in cis (5, 69). However, all substrates examined 
thus far contained an inactivated NS3 serine proteinase domain, 
which might interfere with the accessibility of the 3/4A 
site for trans-cleavage. To test this possibility, we expressed a 
polyprotein, NS3₁₆₇-5B, which begins with residue 167 of NS3 
and therefore lacks the majority of the serine proteinase 
domain. Marker proteins were also expressed beginning with 
NS3 residue 167 and extending to the C terminus of NS3 
(NS3₁₆₇₋₆₃₁) or NS4A (NS3₁₆₇-4A) (Fig. 1). Processed products 
were not observed when NS3₁₆₇-5B was expressed alone 
(Fig. 3). During coexpression with NS3₁₈₁, two NS3-specific 
cleavage products were observed: a major product comigrating 
with NS3₁₆₇₋₆₃₁ and traces of a larger species comigrating with 
NS3₁₆₇-4A (Fig. 3). These results clearly demonstrate that 
NS3₁₈₁ can mediate efficient trans-cleavage at the 3/4A site of 
a substrate which lacks the inactivated proteinase domain. 

FIG. 2. trans-Processing of the HCV-H NS3-5B* polyprotein. 
BHK-21 cell monolayers were infected with vTF7-3 alone (m) or in 
combination with vHCV827-3011 (v) or vHCV1027-1207 (3₁₈₁). Some 
monolayers were also transfected with pBRTM/HCV1027-3011 S1165A 
(3-5B*). Cells were metabolically labeled with ³⁵S-protein labeling 
mixture as described in Materials and Methods. Cell lysates were 
immunoprecipitated with the following HCV-specific antisera: 
NS3-specific WU117 (A), NS5A-specific WU123 (C), NS5B-specific 
WU115 (D), or human patient serum JHF (B). It should be noted that 
the NS3 serine proteinase domain is not recognized by either human 
patient serum JHF or rabbit antiserum WU117, which was raised 
against the NS3 helicase domain. Immunoprecipitated proteins were 
solubilized and separated by electrophoresis on 8% (A, C, and D) or 
14% (B) polyacrylamide-SDS gels. HCV-specific proteins are indicated 
on the right, and the sizes of ¹⁴C-labeled protein molecular mass 
markers (in kilodaltons) are indicated on the left. 

In contrast to the flaviviruses, where NS2B is absolutely 
required for NS3 serine proteinase activity (11, 24, 57), HCV 
sequences upstream of NS3 are not required for serine 
proteinase-dependent cleavages (5, 22, 30). However, the potential 
role of downstream viral polypeptide sequences in proteolysis 
has not been examined. To address this possibility, we 
tested trans-cleavage of NS4A-4B, NS4B-5A, and NS5A-5B 
substrates, each of which contained only a single 
proteinase-dependent cleavage site (Fig. 1). When expressed alone, only 
the appropriate unprocessed polyproteins were present (Fig. 
4). When coexpressed with NS3₁₈₁, NS4A-4B was processed to 
yield NS4A and NS4B (Fig. 4A), and NS5A-5B yielded NS5A 
and NS5B (Fig. 4C). To develop shorter substrates convenient 
for in vitro proteinase assays, we examined trans-processing of 
NS5A₂₉₇-5B₈₈ and NS5A₃₁₃-5B₈₈, which contain the C-terminal 
152 and 136 residues of NS5A, respectively, followed by the 
N-terminal 88 amino acids of NS5B (Fig. 1). NS5A₂₉₇-5B₈₈ was 
processed efficiently by NS3₁₈₁, as evidenced by the conversion 
of most of NS5A₂₉₇-5B₈₈ to NS5A₂₉₇₋₄₄₈. Nearly complete 
trans-cleavage at the 5A/5B site was also observed for 
NS5A₃₁₃-5B₈₈ (Fig. 4D). These results indicate that only 
limited flanking sequences are necessary for efficient 
trans-cleavage at the 5A/5B site by the NS3₁₈₁ serine proteinase. 
Since these substrates do not overlap, these data exclude an 
absolute requirement for one of the downstream viral polypeptides 
for serine proteinase activity. In contrast to the results 
with the NS4A-4B, NS5A-5B, and NS3-5B* substrates, however, 
no trans-cleavage of NS4B-5A was observed (Fig. 4B). 

FIG. 3. Requirements for trans-cleavage at the 3/4A site. BHK-21 
cell monolayers were infected with vTF7-3 alone (m) or in combination 
with vHCV827-3011 (v) or vHCV1027-1207 (3₁₈₁). Some monolayers 
were also transfected with pBRTM/HCV1193-3011 (3₁₆₇-5B), 
pTM3/HCV1193-1657 (3₁₆₇₋₆₃₁), or pTM3/HCV1193-1711 (3₁₆₇-4A). 
Cells were labeled with ³⁵S-protein labeling mixture as described in 
Materials and Methods. HCV NS3-specific products were 
immunoprecipitated with rabbit antiserum WU117, solubilized, and analyzed by 
SDS-PAGE (8% polyacrylamide). HCV-specific proteins are indicated 
on the right, and the sizes of ¹⁴C-labeled protein molecular mass 
markers (in kilodaltons) are indicated on the left. 

NS4A is required for cleavage at the 4B/5A site. Since 
trans-cleavage at the 4B/5A site occurred for the NS3-5B* 
substrate but not for NS4B-5A, we examined trans-cleavage of 
NS4A-5A and NS4B-5B polyprotein substrates (Fig. 1). When 
coexpressed with NS3₁₈₁, the 4B/5A cleavage occurred in the 
NS4A-5A substrate (Fig. 5A, lane 4) but not in NS4B-5B (data 
not shown). These results suggested that, in addition to the 
NS3₁₈₁ proteinase domain, NS4A was required for cleavage at 
the 4B/5A site. The requirement for NS4A was strengthened 
by the observation that processing of NS4B-5A was restored by 
coexpression of NS4A in trans. Cleavage at the 4B/5A site of 
this substrate occurred when NS4A was expressed either as 
part of the proteinase (NS3-4A) (Fig. 5A, lane 9) or as an 
individual polypeptide (NS4A) together with NS3₁₈₁ (Fig. 5A, 
lane 7). These results clearly demonstrate that NS3-mediated 
cleavage at the 4B/5A site requires NS4A, a small protein of 54 
amino acids with a hydrophobic N-terminal half and a 
C-terminal half rich in charged residues (see Discussion). 

Since NS3-4A was fully active for trans-cleavage at the 
4B/5A site, we made two constructs with C-terminal deletions 
in the NS4A region to map NS4A sequences required for this 
activity. NS3-4A₃₅ and NS3-4A₁₉ contain the full-length NS3 
followed by the N-terminal 35 and 19 residues of NS4A, 
respectively (Fig. 1). As evidenced by production of NS5A, 
NS3-4A₃₅ (Fig. 5A, lane 10), but not NS3-4A₁₉ (lane 11), was 
able to process NS4B-5A. In an earlier study, similar constructs 
were generated to map the location of NS4A (32). A polyprotein 
beginning with the C protein and extending through the 
N-terminal 35 residues of NS4A was efficiently processed at 
the 3/4A site. However, a C-terminal truncation to residue 19 
of NS4A appeared to block cleavage at the 3/4A site (32). 
Thus, the inability of NS3-4A₁₉ to function for trans-cleavage 
of NS4B-5A might result from lack of cleavage at the 3/4A site 
and release of the NS4A N terminus rather than deletion of 
NS4A residues 20 to 35. To address this possibility, we 
examined the activity of polypeptides encompassing the 
N-terminal 19 and 35 residues of NS4A (called NS4A₁₉ and 
NS4A₃₅, respectively). NS4A₃₅, but not NS4A₁₉, was able to 
induce trans-cleavage of NS4B-5A by NS3₁₈₁ (Fig. 5B). These 
results indicate that the C-terminal 19 amino acids (residues 36 
to 54) of NS4A, which contain 8 to 9 highly conserved, charged 
residues (see Discussion), are not required for trans-cleavage 
at the 4B/5A site. 

FIG. 4. trans-Processing of HCV-H polyproteins containing only 
one serine proteinase-dependent site. BHK-21 cell monolayers were 
infected with vTF7-3 alone (m) or in combination with vHCV827-3011 
(v) or vHCV1027-1207 (3₁₈₁). As indicated, some monolayers were 
also transfected with pTM3/HCV1658-1972 (4A-4B), 
pTM3/HCV1712-2420 (4B-5A), pBRTM/HCV1973-3011 (5A-5B), 
pTM3/HCV2269-2508 (5A₂₉₇-5B₈₈), or pTM3/HCV2285-2508 (5A₃₁₃-5B₈₈). 
These BHK-21 cells were labeled with ³⁵S-protein labeling mixture as 
described in Materials and Methods. Cell lysates were immunoprecipitated 
with human patient serum JHF (A) or the following HCV-specific 
rabbit antisera: NS5A-specific WU123 (B and C), NS5B-specific 
WU115 (C), and WU113 specific for both NS5A and NS5B 
(D). Apparently, rabbit antiserum WU113, which was raised against a 
fusion protein containing the C-terminal 109 residues of NS5A and the 
N-terminal 203 residues of NS5B, recognizes only the NS5A region but 
not the NS5B sequences in 5A₂₉₇-5B₈₈ and 5A₃₁₃-5B₈₈. Immunoprecipitated 
proteins were solubilized and separated by electrophoresis on 
14% (A and D) or 8% (B and C) polyacrylamide-SDS gels. HCV-specific 
proteins are indicated on the right, and the sizes of ¹⁴C-labeled 
protein molecular mass markers (in kilodaltons) are indicated on the 
left. In panel A, NS4A is difficult to visualize because it contains only 
a single methionine residue (compared with six in NS4B) and migrates 
as a diffuse band on this gel system. 

FIG. 5. Requirements for trans-cleavage at the 4B/5A site. BHK-21 
cell monolayers were infected with vTF7-3 alone (m) or in combination 
with vHCV827-3011 (v) or vHCV1027-1207 (3₁₈₁). As indicated, 
some monolayers were also transfected with the following plasmids: 
pTM3/HCV1658-2420 (4A-5A), pTM3/HCV1712-2420 (4B-5A), 
pTM3/HCV1027-1676 (3-4A₁₉), pTM3/HCV1027-1692 (3-4A₃₅), 
pBRTM/HCV1027-1711 (3-4A), pTM3/HCV1658-1676 (4A₁₉), and 
pTM3/HCV1658-1692 (4A₃₅). Cells were labeled with ³⁵S-protein 
labeling mixture as described in Materials and Methods. HCV-specific 
products were immunoprecipitated with NS5A-specific antiserum 
WU123 (A) or human patient serum JHF (B), solubilized, and 
separated by 8% (A) or 10% (B) polyacrylamide-SDS gels. HCV-specific 
proteins are indicated on the right, and the sizes of ¹⁴C-labeled 
protein molecular mass markers (in kilodaltons) are indicated on the left. 

Cleavage at the 3/4A and 4A/4B sites, which flank NS4A, 
may also require NS4A sequences for efficient cleavage (see 
Discussion). However, since the 5A/5B site can be efficiently 
cleaved in the absence of NS4A (Fig. 4C and D), this protein 
is not absolutely required for NS3 serine proteinase activity. 
For development of an in vitro proteinase assay that does not 
require NS4A, substrates containing the 5A/5B site should be 
good candidates. 

trans-Cleavage between HCV-H and HCV-BK strains. Viral 
proteinases, which are important for polyprotein processing 
and viral replication, present attractive targets for development 
of antiviral therapeutic agents. Since sequence analysis of 
HCV isolates has uncovered considerable genetic diversity, the 
success of a proteinase inhibitor strategy will depend at least in 
part on the conservation of proteinase-substrate interactions 
among different HCV types. In one classification scheme, six 
major genotypes or types (from 1 to 6) are distinguished, with 
some types further divided into related subtypes (64, 65). 
HCV-H (26) and HCV-BK (67) are members of the 1a and 1b 
subtypes, respectively, which represent the major subtypes in 
the United States and Japan. These two strains share 90 and 
87% amino acid sequence identities in the serine proteinase 
domain and in the NS3-5B region, respectively. To examine 
the conservation or divergence of proteinase-substrate interactions 
among different HCV strains, we compared the ability 
of the NS3 serine proteinases of the HCV-H and HCV-BK 
strains to mediate trans-cleavage of homologous or heterologous 
polyprotein substrates. 

FIG. 6. trans-Cleavage between HCV-H and HCV-BK polypeptides. 
BHK-21 cell monolayers were infected with vTF7-3 alone (m) or 
in combination with vHCV827-3011 (v). For trans-cleavage experiments, 
the polyprotein substrates were pBRTM/HCV1193-3011 for 
the H strain (H) or pBRTM/HCV-BK1221-3010 for the BK strain 
(BK). The serine proteinase domains of both strains were expressed as 
the source of the proteolytic activities: vHCV1027-1207 for the H 
strain and pTM3/HCV-BK1027-1207 for the BK strain. The absence 
(-) of certain expression constructs is indicated. Cells were labeled 
with ³⁵S-protein labeling mixture as described in Materials and Methods. 
HCV-specific products were immunoprecipitated with the following 
antisera: NS3-specific WU117 (A), NS5A-specific WU123 (C), 
NS5B-specific WU115 (D), or human patient serum JHF (B). The 
immunoprecipitated proteins were solubilized and separated on 8% 
(A, C, and D) or 14% (B) polyacrylamide-SDS gels. HCV-specific 
proteins are indicated on the right, and the sizes of ¹⁴C-labeled protein 
molecular mass markers (in kilodaltons) are indicated on the left. 

FIG. 7. Pulse-chase analysis of processing in the NS3-4-5 regions. 
HepG2-A16 cells were infected with vTF7-3 alone (m) or coinfected 
with vTF7-3 and vHCV1-3011 (v), pulse labeled with ³⁵S-protein 
labeling mixture for 20 min, and chased for the indicated time as 
described in Materials and Methods. Cell lysates were prepared and 
immunoprecipitated with the following antisera: NS3-specific WU117 
(A), NS5A-specific WU123, or NS5B-specific WU115 (C) or human 
patient serum JHF (B). Immunoprecipitated proteins were solubilized 
and separated by SDS-PAGE (8% polyacrylamide) (A and C) or 
Tricine-SDS-PAGE (14% polyacrylamide) (B). The migration pattern 
of HCV NS5A-specific polyprotein markers is shown on the left in 
panel C. BHK-21 cells previously infected with vTF7-3 were mock 
transfected (m) or transfected with the indicated plasmids and labeled 
with ³⁵S-protein labeling mixture as described in Materials and Methods. 
HCV-specific proteins are identified on the right, and the sizes of 
¹⁴C-labeled protein molecular mass markers (in kilodaltons) are 
indicated on the left. 

For the H strain, we used the NS3₁₈₁ proteinase and the 
NS3₁₆₇-5B substrate described above (Fig. 1). For the 
HCV-BK proteinase, we made a similar construct expressing 
the N-terminal 181 amino acids of HCV-BK NS3 
[NS3₁₈₁(BK)]. The HCV-BK substrate was a polyprotein beginning 
with residue 195 of NS3 and extending through NS5B 
[NS3₁₉₅-5B(BK)]. When NS3₁₆₇-5B was coexpressed with NS3₁₈₁, 
processing at all four sites occurred, as evidenced by the 
appearance of NS3₁₆₇₋₆₃₁, NS4A, NS4B, NS5A, and NS5B 
(Fig. 6). For the BK strain, NS3₁₈₁(BK) was able to mediate 
trans-cleavage at the 3/4A, 4A/4B, and 4B/5A sites of 
NS3₁₉₅-5B(BK), as indicated by the production of NS3₁₉₅₋₆₃₁(BK), 
NS4A, and NS4B (Fig. 6A and B). Thus far, we have been 
unable to identify the HCV-BK NS5A and NS5B cleavage 
products by using HCV-H NS5A- or NS5B-specific rabbit 
antisera or HCV-positive patient antisera collected in the 
United States. As shown in Fig. 6, the serine proteinase 
domain of either strain was fully active at mediating 
trans-cleavage of the heterologous substrate from the other strain. 
NS3₁₈₁(BK) cleaved NS3₁₆₇-5B of the H strain to produce NS3₁₆₇₋₆₃₁, 
NS4A, NS4B, NS5A, and NS5B (Fig. 6). Likewise, 
NS3₁₉₅-5B(BK) was processed by NS3₁₈₁ of the H strain to produce 
NS3₁₉₅₋₆₃₁(BK), NS4A, and NS4B (Fig. 6A and B). Thus, at 
least as assessed by this trans-processing assay, these two 
different HCV subtypes do not appear to have diverged 
significantly in terms of serine proteinase-substrate recognition. 

Kinetics of processing in the HCV NS region. Besides 
defining the minimal domains required for serine proteinase 
activity, it is also of interest to understand the processing 
reactions that occur in the full-length HCV polyprotein. In 
other viral systems, polyprotein cleavages that occur in cis 
versus those occurring in trans can be important for regulating 
RNA replicase function. Such regulation is possible when 
polyproteins, processing intermediates, and mature cleavage 
products have distinct roles in replication (44). To begin to 
examine processing pathways and kinetics in the NS3-4-5 
region, pulse-chase experiments were carried out in 
HepG2-A16 cells by using a vaccinia virus-HCV recombinant, 
vHCV1-3011, which expresses the entire HCV-H polyprotein (47). As 
shown in Fig. 7A, NS3 was readily visible after a 20-min pulse 
and was not associated with any higher-molecular-mass 
polyprotein precursors, indicating that cleavage at both the 2/3 
and 3/4A sites occurs very rapidly, possibly in cis. 
In contrast, processing in the NS4-5 region was generally
slower and a number of processing intermediates were readily
identified. As shown in Fig. 7B, NS4B was readily visible after
a 20-min pulse. A 29-kDa protein comigrating with the product
expressed from pTM3/HCV1658-1972 (NS4A-4B) was identified
as the NS4A-4B polyprotein (Fig. 7B). A decrease in the
level of NS4A-4B was accompanied by an increase in the
amount of NS4B, which suggests that NS4A-4B can be a
precursor for NS4B and NS4A. NS4A was not observed in this
experiment, probably because of its low methionine content
(only one) and inefficient expression by vHCV1-3011 (32).
Four predominant NS5A-containing polyproteins of 160, 135,
87, and 82 kDa were observed after the 20-min pulse (Fig. 7C).
NS4-specific antiserum recognized all of these species except
for the 135-kDa polyprotein (data not shown), whereas the
NS5B-specific antiserum recognized only the 160- and 135-kDa
polyproteins (Fig. 7C). On the basis of their apparent
molecular mass, immunoreactivity, and comigration with marker
polyproteins (Fig. 7C), these four polyproteins were tentatively
identified as NS4-5B (160 kDa), NS5A-5B (135 kDa),
NS4A-5A (87 kDa), and NS4B-5A (82 kDa). It is unclear
whether the 160-kDa polyprotein NS4-5B begins with NS4A or
NS4B or is a mixture of both of these species. The presence of
these four polyproteins suggests that there are several alternative
pathways for processing the NS4-5 region (see Discussion
for more details). Over a 60-min chase, the level of NS5A (58
kDa) increased significantly and was accompanied by a decrease
in the levels of NS5A-5B and NS4-5B (Fig. 7C),
suggesting that these two polyproteins may be the precursors
to NS5A. Because the levels of NS4B-5A and NS4A-5A
increased initially and then decreased during the chase period
(Fig. 7C), they probably represent processing intermediates
between NS4-5B and NS5A. An NS5A-specific protein of 62
kDa (indicated as b in Fig. 7C), barely detectable after 15 min
of chase, became more apparent after 60 min. In a previous
study, several minor NS5A-specific species with slower mobility
were observed in addition to the dominant 58-kDa NS5A
protein (32). Two additional faint bands of 107 and 47 kDa
(labeled a and c, respectively, in Fig. 7C) were observed with
the NS5B-specific antiserum. Product a was also recognized by
NS5A-specific antiserum. These two proteins remain to be
defined, but they may reflect additional proteolytic processing
within the NS5B region. Although NS4-5B and NS5A-5B were
likely precursors to NS5B, the level of NS5B did not change
significantly over a 60-min chase period (Fig. 7C), and this
protein appeared to be unstable relative to most of the other
HCV-encoded proteins (see below). On the other hand, NS3
(Fig. 7A) and NS4B (Fig. 7B) were stable up to 2 h, while a
slight decrease in the level of NS5A was observed (Fig. 7C).

Instability of the NS5B protein. While NS3 was very stable 
during prolonged chase periods, NS5A and, in particular, 
NS5B, the putative HCV RNA polymerase, appeared to be 
rather unstable. NS5A disappeared with a half-life of approx- 
imately 170 min, and NS5B disappeared with a half-life of 
about 70 min (data not shown). This observation is potentially 
interesting because some positive-strand viruses tightly regu- 
late the level of their RNA-dependent RNA polymerase. 
Additionally, the p75 protein of bovine viral diarrhea virus, the 
HCV NS5B homolog, is unstable in bovine viral diarrhea 
virus-infected cells (21). To determine whether other HCV-encoded
proteins might be responsible for the instability of
NS5B, we expressed two different forms of HCV NS5B. One 
form (Met-NS5B) included the entire NS5B region preceded 
by two non-HCV residues, Met-Gly. A second construct 
encoded a ubiquitin fusion protein consisting of the 76-residue 
ubiquitin monomer fused in frame to the N terminus of NS5B 
(Ubi-NS5B). Cleavage of this ubiquitin fusion protein by

FIG. 8. Stability of NS5B expressed in BHK-21 cells or by cell-free
translation. (A) BHK-21 cells previously infected with vTF7-3 were
mock transfected (m) or transfected with one of the following plasmids:
pTM3/HCV2421-3011 (Met-5B) or pTM3/Ubiquitin-HCV2421-3011
(Ubi-5B). The cell monolayers were pulse-labeled with 35S-protein
labeling mixture for 20 min and chased for the indicated times
as described in Materials and Methods. Cell lysates were prepared, and
HCV NS5B-specific products were immunoprecipitated by the rabbit
antiserum WU115. Immunoprecipitated proteins were solubilized and
separated by SDS-PAGE (8% polyacrylamide). HCV-specific proteins
are identified on the right, and the sizes of 14C-labeled protein
molecular mass markers (in kilodaltons) from Amersham are indicated
on the left. (B) Translations with RNA transcripts from
pTM3/HCV2421-3011 or pTM3/Ubiquitin-HCV2421-3011 or without any
transcript (m) were incubated for 60 min at 30°C in reticulocyte lysate
in the presence of [35S]methionine. Translation reactions were
terminated by the addition of RNase A, cycloheximide, and excess cold
methionine and then were chased for the indicated times. The
translation products were solubilized and analyzed by SDS-PAGE
(14% polyacrylamide). The identities of proteins are shown on the
right, and the sizes of 14C-labeled protein molecular mass markers (in
kilodaltons) are indicated on the left.



cellular ubiquitin carboxy-terminal hydrolase should produce
NS5B with its authentic N-terminal Ser residue (4). As shown
in Fig. 8A, the Ubi-NS5B fusion protein was completely
processed to NS5B after a 20-min pulse of transfected BHK-21
cells. Both forms of the NS5B proteins were unstable, as
evidenced by the rapid decline in the level of NS5B (Fig. 8A).
The approximate half-lives were 90 min for Met-NS5B and 70
min for NS5B produced by cleavage of Ubi-NS5B. Coexpression
of NS3 181 had no significant effect on the stability of NS5B
(data not shown). These results indicate that the instability of
NS5B is not due to the presence of other HCV-encoded
proteinases or proteins. Rather, NS5B is inherently unstable
and is probably degraded through a cellular pathway.
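
The half-lives quoted above can be turned into expected fractions of labeled protein remaining at a given chase time. The sketch below assumes simple first-order (single-exponential) decay, which the text does not state explicitly; the function name `fraction_remaining` and the chosen chase times are illustrative only.

```python
def fraction_remaining(chase_min: float, half_life_min: float) -> float:
    """Fraction of labeled protein left after a chase, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (chase_min / half_life_min)


# Approximate half-lives reported in the text (minutes).
half_lives = {"Met-NS5B": 90.0, "NS5B (from Ubi-NS5B)": 70.0}

# Chase times here are arbitrary illustration values.
for name, t_half in half_lives.items():
    for chase in (30, 60, 120):
        left = fraction_remaining(chase, t_half)
        print(f"{name}: {left:.2f} remaining after {chase} min")
```

Under this assumption a band should fall to half its pulse intensity at one half-life and to a quarter at two, which is one way to compare a chase series against the reported values.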

In an attempt to devise an in vitro assay to study NS5B 
degradation, we examined the stability of Met-NS5B or Ubi- 
NS5B produced by cell-free translation of RNA transcripts in 
rabbit reticulocyte lysates (Fig. 8B). Translation reactions were 
terminated by addition of RNase A, cycloheximide, and excess 
cold methionine and were chased for the indicated periods. 
Control experiments showed no further incorporation of 
[35S]methionine after the addition of these three reagents
(data not shown). Ubi-NS5B fusion proteins were completely 
processed to NS5B and ubiquitin (predicted molecular mass of 
8.6 kDa) after a 60-min incubation. Although slight decreases

FIG. 9. Alignment of NS4A sequences. The predicted NS4A amino acid sequences are aligned for selected HCV isolates from six subtypes
(indicated on the left): HCV-H (38), HCV-1 (18), HC-J1 (accession no. D10749), HCV-J (39), ..., HCV-BK ..., HC-J6 (55), HC-J8 (54), and ...

Hyphens indicate residues identical to those of the HCV-H strain sequence. The 14-residue segment (residues 22 to 35) implicated in
trans-cleavage at the 4B/5A site is shaded. Accession numbers for unpublished sequences are given above in parentheses.



in the levels of the proteins were apparent over the 2-h chase,
both Met-NS5B and NS5B produced by cleavage of Ubi-NS5B
were quite stable in the reticulocyte lysates, making it difficult
to assess the role of the ubiquitin-mediated degradation pathway
(which is present in reticulocyte lysates [19]) in the
turnover of NS5B.



DISCUSSION 

It has been previously shown that an active NS3 serine
proteinase is required for processing at four cleavage sites in
the HCV NS3-4-5 region. The results presented here clearly
demonstrate that the proteinase domain, expressed as a
181-residue N-terminal fragment of NS3, is able to mediate
trans-cleavage at all four sites. Bartenschlager et al. (6) recently
reported similar results showing that a fragment of the
polyprotein, including the 212 N-terminal residues of NS3 and
a 20-residue extension into the NS2 region, could also mediate
trans-cleavage at the 4A/4B, 4B/5A, and 5A/5B sites. Our
results, as well as those of two recent studies (6, 23), indicate
that NS4A is absolutely required for the 4B/5A cleavage. Failla
et al. (23) also showed that NS4A of HCV-BK, supplied in
trans, was required for cleavage at the 3/4A and 4B/5A sites
and improves the efficiency of processing at the 4A/4B and
5A/5B sites. On the basis of these results, it was suggested that
NS4A functions as a general effector or cofactor for NS3 serine
proteinase-mediated cleavage in the NS3-4-5 region.
Virus-encoded cofactors required for serine proteinase activity have
also been found for other members of the family Flaviviridae.
The most dramatic example is the NS2B protein of flaviviruses,
which is absolutely required for NS3 serine proteinase-mediated
cleavage at all structural and nonstructural dibasic sites
(2, 11, 15, 24, 45, 48, 56, 57, 70, 72). As discussed by Failla et
al. (23), the pestivirus p10 protein may be the functional
homolog of HCV NS4A, because sequences in this region of
the pestivirus polyprotein appear to be required for the serine
proteinase-dependent cleavage between p58 and p75 (the two
C-terminal products of the pestivirus polyprotein possibly
equivalent to HCV NS5A and NS5B, respectively) (71). For
HCV, NS4A is required for only three cleavages mediated by
the serine proteinase (3/4A, 4A/4B, and 4B/5A). While Failla
et al. showed that NS4A can increase trans-cleavage efficiency
at the 5A/5B site (23), we found that certain substrates
containing this site could be processed efficiently in the
absence of NS4A. Hence, the HCV serine proteinase-dependent
cleavages can be separated into at least two types: (i) cleavages
at the 3/4A, 4A/4B, and 4B/5A sites, which are located adjacent
to hydrophobic sequences and require NS4A as a cofactor; and
(ii) cleavage at the 5A/5B site, which can occur in the absence
of NS4A.

Although the mechanism(s) by which NS4A functions in
proteolytic processing at type 1 sites remains to be determined,
several possibilities can be envisioned. (i) NS4A may act as a
molecular chaperone to facilitate folding of the serine proteinase
domain into an active enzyme. If the active form of the
proteinase is the same for cleavage at both type 1 and type 2
sites, then this model implies that type 1 substrates are
suboptimal and require higher concentrations of active proteinase
for efficient trans-cleavage. (ii) NS4A may bind to type
1 substrates, the proteinase domain, or both to facilitate
proteinase-substrate interactions and cleavage. (iii) NS4A may
facilitate proteolysis of membrane-associated type 1 substrates
by interacting with the proteinase domain and localizing it to
the membrane compartment. Given that NS4A is required for
cleavage at three different sites, it is tempting to propose that
it functions via direct interaction with the proteinase domain.
Thus far, unlike the flavivirus proteinase, which consists of a
stable complex of NS2B and NS3 (3, 14), there is no direct
evidence for association between the HCV NS3 and NS4A
proteins. Suggestive evidence has been obtained, however,
from in vitro studies in which NS3 was found to become
membrane associated when the cell-free translation product
included the NS4A region (35).

Although NS4A is only 54 residues in length, we showed that
a fragment of only 35 N-terminal residues, coexpressed with
the serine proteinase domain, was sufficient for trans-cleavage
at the 4B/5A site. Failla et al. (23) reported that a polypeptide
consisting of the C-terminal 33 residues of NS4A and NS4B
facilitated trans-cleavage at the 4B/5A site. Although flanking
sequences may contribute to NS4A activity, these data suggest
that a 14-residue segment (residues 22 to 35) of NS4A may be
critical for cleavage at the 4B/5A site. As shown in Fig. 9, the
HCV NS4A protein sequence is highly conserved among HCV
strains and consists of a hydrophobic N-terminal portion, a
central region implicated in trans-cleavage at the 4B/5A site
(highlighted in Fig. 9), and a highly charged acidic C-terminal
segment. Although somewhat less conserved than other regions
of NS4A (especially in comparison with HCV-J6 and
HCV-J8), this central region contains two positively-charged
residues, several hydrophobic amino acids, and an absolutely
conserved Gly at position 27. The importance of these residues
for NS4A trans-cleavage activity is currently being tested by
site-directed mutagenesis.

The lack of trans-cleavage at the 3/4A site in previous studies
led to the suggestion that cleavage at this site occurred in cis (5,
69). This cleavage has recently been shown to be insensitive to
dilution, providing direct evidence for a cis mechanism (6).
These observations are consistent with the results of
pulse-chase analyses in which we (this report) and others (6) were
unable to detect NS3-related precursors. Thus, in the current
model, both the 2/3 and 3/4A cleavages are catalyzed by two
distinct viral proteinases in cis. Of interest is the observation
that substrates with an inactivated serine proteinase domain
were resistant to trans-cleavage at the 3/4A site. Efficient
trans-cleavage was observed, however, when the inactivated
proteinase domain was deleted. Although other possibilities
exist, these results, together with the observation that NS4A
sequences are required for cleavage at the 3/4A site (23),
suggest that during translation of the polyprotein, the serine
proteinase domain interacts with nascent NS4A to assume a
conformation capable of cis-cleavage at the 3/4A site. In the
case of the substrate with the inactivated proteinase, this
intermediate still forms but is frozen because it is inactive for
cis-cleavage. Thus, the 3/4A site of this substrate is not
accessible to trans-acting proteinase, because it is probably
bound in the substrate binding pocket of the inactive
autoproteinase.

In contrast to the rapid cleavages observed at the 2/3 and
3/4A sites, processing at the 4A/4B, 4B/5A, and 5A/5B sites
was slower and appeared to involve multiple pathways (this
study and reference 6). An obligate processing order was not
observed, which is consistent with results from a study in which
mutations blocking cleavage at each of these three sites had no
significant effect on processing at other sites (40). Similar
results have been obtained for flaviviruses (45, 46, 52, 56). It is
important to emphasize that the processing pathways and
kinetics observed in mammalian transient expression assays
may not accurately reflect the situation in HCV-infected cells.
In particular, trans-processing reactions, which are important
for temporal regulation of RNA synthesis for other viruses (for
example, see reference 44), would be expected to be sensitive
to the concentration of trans-acting factors, which may be
much lower in HCV-infected cells. Hence, these issues should
be reexamined when systems become available for studying
HCV replication in cell cultures.

Using both the vaccinia virus-T7 and the Sindbis virus
replicon expression systems, we found that the NS5B protein
was unstable compared with the other polyprotein cleavage
products (8) (Fig. 7 and 8 and data not shown). Turnover of
NS5B was similar whether the protein was expressed as part of
the full-length polyprotein or independently as a ubiquitin
fusion protein. In contrast to the results in cell culture assays,
NS5B was found to be relatively stable in reticulocyte lysates.
Since NS5B is the putative HCV RNA-dependent RNA
polymerase (51), down-regulation of this protein could play an
important regulatory role in virus replication, as has been
found for the RNA polymerase of alphaviruses (see reference
66 for a review). For the pestivirus bovine viral diarrhea virus,
the putative RNA-dependent RNA polymerase (p75) is unstable,
with a half-life of less than 60 min in bovine viral diarrhea
virus-infected cells (21). However, the NS5 protein of flaviviruses,
which is not cleaved into two proteins, is relatively stable
(13). In contrast to our results, Bartenschlager et al. found
NS5B (of a strain similar to HCV-J) to be quite stable when
expressed in HeLa cells with a vaccinia virus recombinant (6).
The reason for the discrepancy is unclear, but it could reflect a
difference in the sequence of the expressed NS5B protein or in
the cells used for the expression studies. As mentioned above,
these issues need to be reexamined in a system that supports
HCV RNA replication.

Finally, there is considerable interest in the HCV proteinases
as targets for development of new antivirus therapies. The
general usefulness of such compounds will depend in part on
their ability to inhibit the proteinases of diverse HCV types.
Although it will be important to test more divergent
proteinase-substrate combinations, the ability of the HCV-H serine
proteinase (subtype 1a) to trans-process an HCV-BK substrate
(subtype 1b), and vice versa, suggests that the essential elements
of recognition may be conserved. This is encouraging for
the development of broadly effective serine proteinase inhibitors.

Abstract. The sequence of the entire RNA genome of the type flavivirus, yellow
fever virus, has been obtained. Inspection of this sequence reveals a single long open
reading frame of 10,233 nucleotides, which could encode a polypeptide of 3411
amino acids. The structural proteins are found within the amino-terminal 780
residues of this polyprotein; the remainder of the open reading frame consists of
nonstructural viral polypeptides. This genome organization implies that mature viral
proteins are produced by posttranslational cleavage of a polyprotein precursor and
has implications for flavivirus RNA replication and for the evolutionary relation of
this virus family to other RNA viruses.

The Flavivirus genus, family Flaviviridae,
consists of a group of some 70
closely related human or veterinary
pathogens causing many serious illnesses,
including dengue fever, Japanese encephalitis,
St. Louis encephalitis, Murray
Valley encephalitis, tick-borne encephalitis,
and yellow fever (1). Most
flaviviruses are transmitted to vertebrate
hosts by blood-sucking arthropods, mosquitoes
or ticks, although some evidently
lack an arthropod vector (2). Arthropod-transmitted
flaviviruses replicate in the
arthropod host as well as the vertebrate
host. Human flavivirus diseases have
diverse and complex pathologies and different
viruses exhibit marked tissue tropisms.
Many are neurotropic, causing
encephalitic symptoms; others, such as
the dengue group, replicate preferentially
in host macrophages, whereas yellow
fever is usually viscerotropic.

The disease known as yellow fever has
been recognized for several hundred
years (3, 4). Until the early 1900's recurrent
epidemics occurred in the Caribbean
area which caused great human suffering
and had a profound influence on
human activities in the area. From its
focus in the Caribbean, epidemic yellow
fever was spread by ship to ports as far
north as Boston and as far east as England,
where mortality rates in an epidemic
could exceed 20 percent of those
contracting the disease. Walter Reed and
colleagues in pioneering studies in Cuba
in 1900 demonstrated that yellow fever is
transmitted by mosquitoes, and 2 years
later showed that the disease agent is
filterable (5). With the recognition that 
the mosquito Aedes aegypti is the vector 
for urban yellow fever, mosquito control 
measures rapidly led to the elimination 
of urban yellow fever. Subsequently, a 
safe and effective attenuated vaccine 
strain (17D) was developed by in vitro 
passage of the virulent Asibi strain in 
chicken embryo tissue (6). However, the 
virus persists in a sylvan cycle in the 
forests of South America and Africa, 
transmitted by numerous mosquito spe- 
cies including those of the genus Haema- 
gogus in South America and of the genus 
Aedes in Africa. The vertebrate hosts in 
this cycle appear to be almost exclusive- 
ly primates, demonstrating the limited 
natural host range of yellow fever. From 
the sylvan cycle periodic outbreaks in 
neighboring human populations have 
arisen on both continents. Furthermore,
since Aedes aegypti is widespread in the 
world, a situation exacerbated by relax- 
ation of mosquito abatement procedures 
in the Caribbean and elsewhere, the po- 
tential exists for future epidemics of ur- 
ban yellow fever. 



Previous studies have shown that fla- 
viviruses contain single-stranded infec- 
tious RNA (thus defining them as plus- 
stranded RNA viruses in which the viri- 
on RNA serves as a messenger) encapsi- 
dated in a nucleocapsid possessing 
icosahedral symmetry and containing a 
single species of capsid protein [C, apparent
mass of about 14 kilodaltons
(kD)]. This in turn is surrounded by a 
lipid bilayer containing an envelope pro- 
tein (E; about 50 to 60 kD) that is usually 
but not invariably glycosylated (7) and a 
second, nonglycosylated protein (M; 
about 8 kD) (5, 9). How the envelope is 
obtained is unclear, as budding flavivi- 
ruses are seldom identified in electron 
microscopic studies, although matura- 
tion does appear to occur in association 
with intracellular membranes (9, 10). 
Replication of flaviviruses in tissue cul- 
ture is slow, with a long latent period, 
and only moderate titers of virus are 
produced. Host cell protein and RNA 
synthesis are shut off only poorly (verte- 
brate cells) or not at all (mosquito cells), 
making study of flavivirus replication 
and structure somewhat more difficult. 
Virus-specific protein synthesis appears 
to be associated with the rough endo- 
plasmic reticulum, and RNA replication 
is localized in the perinuclear region (11). 
No subgenomic RNA has been detected 
in cells infected with flaviviruses, and it 
is believed that the genomic length RNA 
which is capped but not polyadenylated 
(12, 13) is the only messenger RNA 
(mRNA) species (9, 12, 14). This mRNA 
is translated into the three structural 
proteins and several nonstructural pro- 
teins. Translation of the flavivirus 
genome in vitro produces polypeptides 
related to the structural proteins (15) 
which, in the presence of appropriate 
membrane fractions, can be processed 
efficiently to yield C and E (16). Peptide 
mapping of in vitro translation products 
as well as selective incorporation of N- 
formylmethionine suggest that initiation 
in vitro occurs only with the capsid pro- 
tein. Alternatively, studies on the in vivo 
translation of flavivirus Kunjin have 
been based on the use of pactamycin or 
high salt inhibition of translation initia- 
tion (17) or ultraviolet inactivation of 
translation (18) in an attempt to map the 
genome order of flavivirus proteins on 
the assumption that there is just a single 
site for initiation of translation. These 
experiments have led Westaway and col- 
laborators to suggest that multiple inde- 
pendent translation initiation sites are 
used within flavivirus RNA, a situation 
not typically found with other eukaryotic 
mRNA's (19). 
We now present the complete nucleotide
sequence of the yellow fever
genome determined from complementary
DNA (cDNA) clones of the 17D
vaccine strain. Together with recent
NH2-terminal sequence analysis of both
structural (20) and some nonstructural
yellow fever proteins, the amino acid
sequences of the encoded proteins have
been deduced and a preliminary picture
of flavivirus gene organization and
expression has begun to emerge.

Sequence of yellow fever RNA. The 
complete sequence of yellow fever RNA 
is shown in Fig. 1. The 5'- and 3'- 
terminal sequences presented were de- 
rived from several independent clones, 
are homologous to the 5' and 3' termini 
of West Nile flavivirus genomic RNA 
(27) (see below), and thus probably re- 
flect the extreme ends of the yellow 
fever genome. Given these assumptions, 
the RNA genome is 10,862 nucleotides in 
length and has a mass of 3.75 x 10^6
daltons (expressed as the sodium form). 
Previous reports have shown that flavi- 
virus genomic RNA contains a type 1 
cap at the 5' terminus but lacks a polya- 
denylate tract at the 3' terminus (12, 13). 
The base composition of the RNA is 27.3 
percent A, 23.0 percent U, 28.4 percent 
G, and 21.3 percent C.
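
As a rough consistency check on the figures above, the genome length and base composition can be combined into an approximate mass. The per-residue masses used here (about 329.2, 306.2, 345.2, and 305.2 daltons for in-chain A, U, G, and C) and the adjustment of about 22 daltons per residue for the sodium form are commonly used approximations supplied for illustration, not values from the article. The 5' cap and end groups are ignored, and the function name is hypothetical.

```python
# Approximate in-chain residue masses (daltons, free-acid form); assumed values.
RESIDUE_MASS = {"A": 329.2, "U": 306.2, "G": 345.2, "C": 305.2}
# Replacing one H with Na per phosphate; assumed approximation.
SODIUM_ADJUST = 22.0

# Values taken from the text.
GENOME_LENGTH = 10_862
BASE_FRACTION = {"A": 0.273, "U": 0.230, "G": 0.284, "C": 0.213}


def approximate_rna_mass(length: int, fractions: dict) -> float:
    """Estimate RNA mass from length and base composition (sodium form)."""
    mean_residue = sum(fractions[b] * RESIDUE_MASS[b] for b in fractions)
    return length * (mean_residue + SODIUM_ADJUST)


mass = approximate_rna_mass(GENOME_LENGTH, BASE_FRACTION)
print(f"fractions sum to {sum(BASE_FRACTION.values()):.3f}")
print(f"approximate mass: {mass:.3g} daltons")
```

Under these assumptions the estimate lands close to the reported 3.75 x 10^6 daltons, which suggests the length, composition, and mass quoted in the text are mutually consistent.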

It is striking that the RNA contains an 
extremely long open reading frame, 
which spans virtually the entire length of 
the genome. This open reading frame, 
beginning from the first AUG triplet, is 
10,233 nucleotides in length, terminating 
with a single opal codon (UGA), and 
could encode a polypeptide of 380,763 
daltons, leaving 5'- and 3'-noncoding
regions of 118 and 511 nucleotides,
respectively. Examination of the remaining five

Fig. 2. Organization and processing of proteins encoded by the yellow fever genome.
Untranslated regions are shown as single lines and the translated region as an open box. The
open triangle is the initiation codon (AUG); the solid diamond the termination codon (UGA).
The protein nomenclature is described in Table 1 and (53). The single letter amino acid code is
used for sequences flanking assigned cleavage sites (solid lines). Two other potential cleavage
sites are shown as dotted lines. Structural proteins, identified nonstructural proteins, and
hypothesized nonstructural proteins (see text) are indicated by solid, open, and hatched boxes,
respectively. Other potential cleavage sites have been found and are described in Table 1,
footnote asterisk.



possible reading frames (two in the viri- 
on RNA and three in the complementary 
RNA) reveals multiple stop codons in 
every case, with the longest possible 
other open reading frame being 804 nu- 
cleotides (in the complementary strand). 
Thus there is no reason to expect that 
any protein is translated from yellow 
fever RNA other than the polyprotein 
encoded by the long open reading frame 
shown in Fig. 1. 
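
The six-frame inspection described above can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: it reports the longest stretch free of stop codons in each of the three frames of an RNA string and of its reverse complement, without requiring an initiating AUG. The function names and the short test sequence are made up.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_open_stretch(rna: str, frame: int) -> int:
    """Longest run of codons without a stop in one frame, in nucleotides."""
    best = current = 0
    for i in range(frame, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            current = 0
        else:
            current += 3
            best = max(best, current)
    return best


def six_frame_summary(rna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to the longest stop-free stretch."""
    rna = rna.upper()
    strands = {"+": rna, "-": reverse_complement(rna)}
    return {
        f"{sign}{frame + 1}": longest_open_stretch(seq, frame)
        for sign, seq in strands.items()
        for frame in range(3)
    }


# Made-up example sequence, not yellow fever RNA.
print(six_frame_summary("AUGGCUUAAGGCCAUUGA"))

# Arithmetic check on the lengths quoted in the text.
assert 118 + 10_233 + 511 == 10_862
assert 10_233 // 3 == 3411
```

The two closing assertions confirm that the noncoding and coding lengths quoted in the text add up to the stated genome length, and that the open reading frame corresponds to 3411 codons.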

The structural proteins of yellow fever
virus. The start points of the three yellow
fever virus structural proteins (C, M,
and E) have been positioned within the
translated RNA sequence from NH2-terminal
amino acid sequences obtained for
the structural proteins isolated from yellow
fever virions (20) (Fig. 1). The capsid
protein is the first protein found in the
long open reading frame and begins one
residue past the first methionine. Thus,
in agreement with in vitro translation
data from the flavivirus genomic RNA's
of tick-borne encephalitis virus, West
Nile virus, and Kunjin virus (15, 16), the
translation of the yellow fever genome
initiates with the capsid protein, and the
NH2-terminal methionine is removed
during maturation of the protein (20).
The capsid protein may be released from
the precursor polyprotein by cleavage at
or just past a series of basic amino acids
(Figs. 1 and 2). From this deduced amino
acid sequence, the capsid protein is quite
basic, containing about 25 percent lysine
and arginine distributed throughout the
protein. The capsid protein of tick-borne
encephalitis virus contains a similar
proportion of basic amino acids (22). Since
the capsid protein forms complexes with
the RNA, its highly basic character probably
acts to neutralize some of the RNA
charges in such a compact structure.



Fig. 1 (preceding page and opposite page). Entire sequence of the genome of yellow fever virus. Yellow fever virus 17D vaccine strain was

Fig. 3. Hydrophobicity plot of the yellow fever polyprotein sequence. The program of Kyte and
Doolittle (54) with a search length of seven amino acids was used. Cleavage sites localized by

There is also a hydrophobic stretch of 16
uncharged amino acids beginning with 
residue 42 from the NH2 terminus (see
Fig. 3), which is conserved among flavi- 
viruses (23) and may be involved in 
protein-protein or specific protein-RNA 
interactions (or both) which assemble 
the nucleocapsid and lead to acquisition 
of the lipoprotein envelope by the cap- 
sid. 
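
A hydropathy profile of the kind cited for Fig. 3 can be sketched as a sliding-window average. The scale below is the Kyte-Doolittle hydropathy index, and the window of seven residues follows the figure legend; the function name and the test peptide are illustrative assumptions, and the original program may differ in details such as how window positions are reported.

```python
# Kyte-Doolittle hydropathy index, one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list:
    """Mean hydropathy of each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Made-up peptide: charged ends flanking an uncharged core.
peptide = "KKRSLLLLVVVAAIIGDE"
profile = hydropathy_profile(peptide)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}")
```

Long runs of windows with high positive averages are the kind of uncharged stretch the text interprets as a possible signal sequence or membrane anchor.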

The start point of the virion M protein 
is also shown in Fig. 1. This protein 
contains a charged NH 2 -terminal domain 
and two long uncharged stretches at its 
COOH terminus; these two stretches are 
separated by a single basic residue (Figs. 
1 and 3) and could act as membrane 
spanning anchors similar to those ob- 
served in many virus envelope proteins. 
Protein M has not been identified in 
infected cells and is postulated to be 
derived from a precursor glycoprotein 
which we call prM (Table 1), which is 
also called by others GP23, GP19, or 
NV2 (8, 24). The sequence data support 
this hypothesis. A possible start point of 
prM, as deduced by limited homology 
with the NH r terminal sequence of the 
flavivirus St. Louis encephalitis NV2 
(20) and homology in this region with 
Murray Valley encephalitis virus (23), 
follows the capsid protein; prM may 
begin with an uncharged stretch of amino 
acids which could function as an NH 2 - 
terminal signal sequence for its cotrans- 
lational insertion into the endoplasmic 
reticulum (Fig. 3). After this hydropho- 
bic domain, which may or may not be 
removed by signalase, the prM sequence 
contains three possible glycosylation 
sites of the type Asn-X-Ser/Thr, The 
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NH 2 terminus of M (20) follows the 
sequence Arg-Ser-Arg-Arg in prM, indi- 
cating that the cleavage to produce M 
may be effected by the same enzyme that 
cleaves a number of viral envelope pre- 
cursors at the sequence Arg-X-Arg/Lys- 
Arg and that has been postulated to be a 
host protease localized in the Golgi appa- 
ratus or Golgi-derived vesicles (25), per- 
haps similar to the cathepsins (26). As a 
result of this cleavage, which apparently 
occurs late during virus maturation and 
release, an 1 1 .4-kD (not including carbo- 
hydrate) glycopeptide would be removed 
leaving the nonglycosylated M protein 
embedded in the virion membrane. 
Trace quantities of small virus-specific 
glycoproteins have been detected in cy- 
toplasmic extracts (27, 28, 29), but 
whether the glycopeptide fragment re- 
mains cell-associated and is rapidly de- 
graded or is released into the extracellu- 
lar medium is unknown. 
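
As an illustration of how candidate glycosylation sites of the type Asn-X-Ser/Thr can be located in a deduced amino acid sequence, the following minimal Python sketch scans a sequence written in one-letter code. The function name, the toy sequence, and the option to exclude proline at position X are assumptions made for illustration; they are not taken from the analysis reported here.

```python
import re


def find_sequons(protein, allow_proline=True):
    """Return (1-based Asn position, tripeptide) for each Asn-X-Ser/Thr site."""
    # X is any residue; optionally exclude Pro (an assumption, not from the text).
    x_class = "[A-Z]" if allow_proline else "[A-OQ-Z]"
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=(N{x_class}[ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


toy = "MKNGSAANATLLNPSQ"  # hypothetical sequence, not yellow fever virus
print(find_sequons(toy))
print(find_sequons(toy, allow_proline=False))
```

A match only marks a possible site; whether carbohydrate is actually attached has to be established experimentally.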

The E protein follows M. The NH2
terminus of E is charged, and the more 
hydrophobic COOH-terminal domain of 
M (or its precursor, prM) may function 
as the signal sequence for the transloca- 
tion of E across the rough endoplasmic 
reticulum. The protein E contains two 
sites of the form Asn-X-Ser/Thr which
could serve as carbohydrate attachment 
sites, and both glycosylated and nongly- 
cosylated forms have been detected in 
infected cells (7, 27, 30). The COOH- 
terminal domain of E contains uncharged 
stretches that could serve as a trans- 
membrane anchor. Cleavage between M 
and E occurs after a Ser residue, and 
could be catalyzed by a host protease 
such as signalase. Since the COOH ter-
minus of the mature M protein has not
been determined, a small peptide, analo- 
gous to the 6 kD protein of alphaviruses 
(25, 57) could be produced during matu- 
ration of M and E. However, the appar- 
ent size of the M protein agrees well with 
the predicted molecular weight if cleav- 
age occurs after the Ser at position 285. 

This model for translation and process- 
ing of structural proteins and the features 
mentioned above predict that most of the 
E protein and some of the M protein 
should be exposed on the mature virion 
surface, and therefore sensitive to diges- 
tion by appropriate proteases. Protease 
digestion of purified tick-borne encepha- 
litis virus (32) and also yellow fever virus 
(29) support this hypothesis. Thus, the M 
protein (or prM) of flaviviruses is an 
integral membrane protein and may in- 
teract specifically with both the E pro- 
tein as well as the capsid protein-RNA 
complex during virus assembly. 

The nonstructural proteins. In addi- 
tion to prM, at least four and as many as 
12 nonstructural proteins have been de- 
scribed in flavivirus-infected cells (9, 28, 
33, 34). Some or all of these proteins 
must be active in the replication of the 
viral RNA. The start points of the three 
largest nonstructural proteins (NV3, 
NV4, and NV5 by the old nomenclature) 
(35) have been located by NH2-terminal
amino acid sequence analysis (36). As 
previously suggested by peptide map- 
ping of the corresponding nonstructural 
proteins from other flaviviruses (9, 15, 
34), the sequence data show that these 
proteins map to nonoverlapping seg- 
ments in the yellow fever virus nonstruc- 
tural region (Figs. 1 and 2). 

In an attempt to simplify the descrip- 
tion of flavivirus encoded nonstructural 
polypeptides, in particular the smaller 
proteins, we suggest a modified nomen- 
clature (35) (Table 1) based on the linear 
order of these proteins in the yellow 
fever virus genome to complement desig- 
nations based on their apparent molecu- 
lar weights (37). In taking this approach 
we assume that members of Flaviviridae 
will have similar genome organization
and express homologous proteins from 
homologous regions of their genomes. 
This assumption has been partially veri- 
fied by an extensive sequence compari- 
son of yellow fever virus with another 
member of the flavivirus genus, Murray 
Valley encephalitis virus (23). 

Several features of the yellow fever 
virus nonstructural region are apparent 
from the localization of NS1, NS3, and 
NS5 (formerly NV3, NV4, and NV5). 
First, NS1 immediately follows the puta-
tive transmembrane segment of the E
protein. It should be noted that NS1 is
glycosylated (27), and monoclonal anti-
bodies against NS1 are capable of medi-
ating complement-dependent lysis of yel-
low fever virus-infected cells, suggesting
its presence at the plasma membrane
(38). Thus, the COOH-terminal un-
charged hydrophobic sequence of E
could function as a signal sequence for
translocation of NS1 across the endo-
plasmic reticulum. NS1 contains two
sites of the type Asn-X-Ser/Thr which
could serve as glycosylation sites. The
probable COOH terminus of NS1 from
estimates of molecular weight could con-
tain a hydrophobic sequence for anchor-
ing the protein in the membrane (Fig. 3).
Thus the three glycoproteins of yellow
fever virus, prM, E, and NS1, are adja-
cent to one another in the genome and
are possibly inserted into the membrane
one after another during synthesis. The
sequence data support the hypothesis
that each has the usual membrane pro-
tein topology of an NH2 terminus out-
side and a COOH-terminal hydrophobic
anchor. However, additional experi-
ments are required to rigorously estab-
lish their orientation with respect to the
lipid bilayer and exact COOH termini.
The function of NS1 is unknown, but it
could be involved in virus assembly rath-
er than RNA replication. In this regard,
it is of interest that NS1 has been shown
to be the soluble complement-fixing anti-
gen for dengue 2 (28) and suggestive
evidence exists for a comparable role of
NS1 in yellow fever virus infection (8,
27). Thus, this protein may exist in alter-
native membrane-associated and soluble
forms, perhaps because of the presence
or absence of the COOH-terminal hydro-
phobic domain.

NS3 begins at residue 1485 in the 
polyprotein sequence and is produced by 
cleavage at the site Gly-Ala-Arg-Arg ↓
Ser; the NH2 terminus of NS5 has been
tentatively identified as residue 2507 af-
ter cleavage at Thr-Gly-Arg-Arg ↓
Gly. Since no host proteases with this
specificity (which are active in the cyto- 
sol) have been characterized and animal 
viruses often encode proteases active in 
the processing of their cytoplasmic poly- 
protein precursors, yellow fever virus 
may encode a protease that cleaves after 
two Arg residues (or two basic residues) 
surrounded by amino acids with short 
side chains, often Gly (Table 1 and foot- 
note asterisk). 
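
The kind of scan implied by this proposal can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the set of "short side chain" residues (Gly, Ala, Ser, Val), the treatment of Lys as an alternative basic residue, the check of only the downstream flanking residue, and the toy sequence are all choices made for the example, not parameters given in the text.

```python
import re

SMALL = "GASV"  # assumed "short side chain" residues; the text names only Gly
BASIC = "RK"    # Arg or Lys


def dibasic_sites(protein, basic=BASIC, small=SMALL):
    """Return 1-based positions of small residues that directly follow a basic pair."""
    pattern = re.compile(rf"(?=[{basic}]{{2}}[{small}])")
    # m.start() is the 0-based index of the first basic residue, so the residue
    # after the proposed cleavage sits at 1-based position m.start() + 3.
    return [m.start() + 3 for m in pattern.finditer(protein.upper())]


toy = "MTGARRSLLEGRRGAAQRRVK"  # hypothetical sequence built from the quoted motifs
print(dibasic_sites(toy))
```

Each reported position is the first residue of the putative downstream product; such hits are candidates only and would need NH2-terminal sequence data to confirm.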

These assignments leave two regions 
in the polyprotein for which polypeptide 
products have not yet been identified. 
Assuming that other nonstructural pro- 
teins will be produced from these regions 
by the same protease responsible for 
NH2-terminal cleavage of NS3 and NS5,
we have scanned the remaining se-
quences for additional cleavage sites.
Estimates of molecular weight (27) have
positioned the COOH terminus of NS1
near residue 1187. The next potential
cleavage sequence, Gly-Arg-Arg ↓ Ser,
at residue 1355 would produce two small
nonstructural polypeptides of approxi-
mately 18 kD (ns2a) and 14 kD (ns2b)
located between NS1 and NS3 (Fig. 2
and Table 1). Both of these polypeptides
would be extremely hydrophobic (Fig. 3)
with ns2b containing a short internal
charged domain. The putative cleavage
at the sequence Glu-Gly-Arg-Arg ↓ Gly
(residue 2108) would produce a polypep-
tide whose calculated mass agrees well
with the observed size of NS3 on poly-
acrylamide gels (27, 29). Between this
site and the NH2 terminus of NS5 a
single potential cleavage site (Ala-Gln-
Arg-Arg ↓ Val) is found preceding resi-
due 2395. Cleavage here would result in
two methionine-rich, hydrophobic poly-
peptides of 31 kD (ns4a) and 12 kD
(ns4b) (see Figs. 2 and 3 and Table 1).
Polypeptides of these approximate sizes 
(10, 14, 18, and 30 kD) do exist in yellow 
fever-infected cells, but definitive map-
ping of these polypeptides as well as
other minor species awaits additional
NH2-terminal sequence data. Similarly,
in the absence of COOH-terminal se- 
quence data we cannot be sure of the 
exact terminal residues. Some heteroge- 
neity in flavivirus polypeptides may re- 
sult from variable exopeptidase digestion 
of the COOH-terminal residues or alter- 
native internal cleavages. The predicted 
size of NS5, if the protein encompasses 
the remainder of the open reading frame, 
agrees well with its observed size (27). 
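
As a rough check on the sizes quoted above, the residue numbers in the text can be converted to approximate masses. The sketch below is illustrative only: it assumes an average residue mass of 110 Da (a common rule of thumb, not a value from this work), assumes that each quoted residue number is the first residue downstream of the cleavage, and takes ns2a to begin just after the estimated NS1 COOH terminus near residue 1187.

```python
AVG_RESIDUE_DA = 110.0  # assumed average residue mass, not a value from the text

# First residue of each product as read from the text (see assumptions above).
FIRST_RESIDUE = [
    ("ns2a", 1188),
    ("ns2b", 1355),
    ("NS3", 1485),
    ("ns4a", 2108),
    ("ns4b", 2395),
    ("NS5", 2507),
]


def approx_masses_kd(first_residues, avg=AVG_RESIDUE_DA):
    """Estimate the mass in kD of every product whose successor start is known."""
    masses = {}
    for (name, start), (_, next_start) in zip(first_residues, first_residues[1:]):
        masses[name] = (next_start - start) * avg / 1000.0
    return masses


for name, kd in approx_masses_kd(FIRST_RESIDUE).items():
    print(f"{name}: about {kd:.1f} kD")
```

NS5 is left out because the end of the open reading frame is not stated in this passage. The estimates are only as good as the average-mass assumption and ignore any carbohydrate.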

Implications for flavivirus replication. 
It has been suggested that flavivirus 
RNA is translated by multiple internal 
initiation events (17, 18) which would
make flaviviruses atypical among eu- 
karyotic viruses and eukaryotic genes. 
The presence of a single long open read-
ing frame in yellow fever virus RNA, the
fact that the final proteins found do not
initiate with methionine but appear to
arise from a consistent set of proteolytic
cleavages, the gene order deduced from
the pactamycin runoff experiments of
Westaway (17), the in vitro translation
data (15, 16), and recent evidence for
polyprotein precursors (39) all support
the view that translation of the flavivirus
genome in vivo initiates with the capsid
protein near the 5' end of the genome
and proceeds sequentially through the
genome to produce one precursor poly-
protein.

Table 1. Flavivirus polypeptides.

Protein   Old     Terminal   Molecular weight    Pred.     Glyco-        Comments
(35)              site*                                    sylated?

Structural region
C         V2                 13,000 to 16,000    11,320    No            Nucleocapsid protein
prM       (NV2)              19,000 to 23,000    20,925    Yes           Precursor to M
M         V1      SRR ↓ A    8,000 to 8,500      8,526     No            Virion envelope protein
E         V3      AYS ↓ A    51,000 to 60,000    53,712    Both forms§   Major virion envelope protein

Nonstructural region
NS1       NV3     VGA ↓ D    44,000 to 49,000    45,869    Yes           Soluble complement-fixing antigen
ns2a                         16,000 to 21,000    18,086    No            Hydrophobic; function unknown
ns2b                         12,000 to 15,000    13,823    No            Hydrophobic; function unknown
NS3       NV4                67,000 to 76,000    69,319    No            Replicase component?
ns4a                         24,000 to 32,000    31,196    No            Hydrophobic; function unknown
ns4b                         10,000 to 11,000    12,159    No            Hydrophobic; function unknown
NS5       NV5                91,000 to 98,000    104,079   No            Replicase component?

Cleavage of this precursor is rapid and 
occurs during translation so that the pre- 
cursor is not seen in its entirety. The 
location and frequency of characteristic 
cleavage sites in this precursor suggest 
that processing involves both virus en- 
coded and cellular organelle bound pro- 
teases. Although internal translation ini- 
tiation cannot be formally excluded, the 
5' terminal location of the structural 
genes and the 3' terminal replicase genes 
implies that the relative amounts of 
structural and nonstructural gene prod- 
ucts could also be regulated by prema- 
ture termination as well as by nonuni- 
form rates of translation (40) or differen- 
tial stability of the final products. A 
potential secondary structure in yellow
fever RNA just past the structural pro- 
tein genes could possibly be active in the 
former mechanism. It is unclear why 
gene mapping experiments with ultravio- 
let light to inactivate translation (18) or 
high salt to synchronize initiation of 
translation (17) suggest multiple indepen- 
dent sites of initiation and do not allow 
prediction of the correct gene order. 
Possible explanations are that ribosomes 
might have slow transit velocities in 
some areas, due to RNA secondary 
structures or the presence of rare codons 
(40), or that it might be necessary to
translate a functional protease to pro- 
duce the final products. 

Several features potentially important 
in RNA replication or packaging (or
both) can be identified in the genomic
sequence. First, the extreme 5'- and 3'-
terminal sequences are homologous to 
those found for another flavivirus, West 
Nile virus (21) (Fig. 4), and the comple-
ment of the 5'-terminal sequence [equiv-
alent to the 3' terminus of the (-) strand]
is related to the 3'-terminal sequence of
the (+) strand. This suggests that the
viral replicase may have similar recogni-
tion sites for (+) and (-) strand synthe-
sis. In addition, a stable secondary struc-
ture (ΔG = -40 to -45 kcal) can be
formed from the 3'-terminal 87 nucleo- 
tides of the yellow fever genomic RNA 
(Fig. 5). This may be involved in RNA
replication as well as encapsidation, and
if conserved among flaviviruses could
explain the observation that many flavi-
virus RNA's are poor substrates for 3'-
terminal enzymatic modification includ-
ing ligation and addition of poly(A) (po-
lyadenylate). Similar transfer RNA-like
secondary structures and conserved se-
quences have been identified at the 3'
end of many plant viral RNA's (41); in
addition to serving as substrates for
aminoacylation both in vivo and in vitro
(42) they are important for initiation of
(-) strand RNA synthesis (43). Last, the
3'-untranslated region contains a set of
three closely spaced repeated sequences
(underlined in Fig. 1) (located between
nucleotides 10,374 and 10,520) each ap-
proximately 40 nucleotides long with an
average of six changes between them in
pairwise comparisons. The significance
of these repeats in flavivirus replication
is unknown.

Fig. 5. Possible secondary structures at the 3'
terminus of yellow fever virus genomic RNA.
Circled nucleotides are shared with the 3'
terminus of the yellow fever (-) strand (see
Fig. 4). ΔG values were calculated according
to Tinoco et al. (55). A more stable conforma-
tion than the one shown (form 1) can be
formed if the two overlined sequences are
base paired (form 2).

Evolution of flaviviruses. It is becom-
ing clear that the flaviviruses deserve
their recent reclassification as a family
separate from the alphaviruses. Although
the mature virions are morphologically
similar to alphaviruses in that they have
a single-stranded RNA (+) sense 
genome encapsidated in an icosahedral 
nucleocapsid and surrounded by a lipid 
bilayer containing virus-specified poly- 
peptides, they differ markedly in genome 
organization and replication strategy 
(44). The location of the genes encoding 
the structural proteins at the 5' end of the 
genome, the single long reading frame, 
and the lack of a subgenomic message 
are all characteristics shared with picor- 
naviruses rather than togaviruses. 

In order to understand the evolution- 
ary role of flaviviruses and their relation 
to other RNA viruses we have searched 
for homologies within the putative 
polymerase genes of various plant and 
animal viruses. Significant homologies
have been found between alphaviruses
and plant viruses (45) and less extensive
homologies between picornaviruses and
alphaviruses (46). Kamer and Argos (46)
have aligned the polymerase gene of
poliovirus with those of several viruses
including alfalfa mosaic virus, brome-
grass mosaic virus, tobacco mosaic vi-
rus, Sindbis virus, foot and mouth dis-
ease virus, encephalomyocarditis virus,
and cowpea mosaic virus. The amino
acid sequence of yellow fever virus NS5
between residues 3037 and 3181 can also
be aligned with this collection of diverse
RNA viruses (Fig. 1). These homologous
regions are convincing but short and 
probably represent conserved functional 
domains for particular RNA-dependent 
polymerase functions. It is interesting to
speculate on the origin of this diverse
group of viruses. Whether they arose 
from one or a few protoviruses (perhaps 
insect viruses) and have radiated to their 
current divergent hosts or whether the 
viruses have repeatedly cannibalized 
their hosts, obtaining their replicases 
from eukaryotic cellular functions can- 
not be resolved at present. However, 
one possible measure of host adaptation 
or origin of viral genes from host func- 
tions is the CG doublet frequency in the 
RNA. Insects, insect viruses, and alpha- 
viruses (insect-borne with vertebrate 
hosts) have the expected CG doublet 
frequency predicted from their base 
compositions (47), whereas vertebrate 
DNA (48), viruses with exclusively ver- 
tebrate hosts, and yellow fever virus 
have low CG doublet frequencies (2.4 
percent CG found in yellow fever com- 
pared to 6.1 percent predicted from the 
base composition). Given the rapid evo- 
lution of RNA genomes, it is unlikely 
that this difference applies directly to the 
question of evolutionary origin of alpha- 
viruses and flaviviruses but rather re- 
flects alternative strategies of adaptation 
to their arthropod and vertebrate hosts in 
ways which are not currently under- 
stood. 
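
The comparison of observed and predicted CG doublet frequencies can be sketched as below. This is a minimal illustration that assumes the predicted frequency is the product of the C and G base frequencies on the same strand; the function name and the toy sequence are hypothetical and no viral sequence data are reproduced.

```python
def cg_doublet_frequencies(rna):
    """Return (observed, expected) CG doublet frequencies for one strand."""
    seq = rna.upper().replace("T", "U")
    if len(seq) < 2:
        raise ValueError("need at least two nucleotides")
    n = len(seq)
    doublets = n - 1  # number of overlapping dinucleotide positions
    observed = sum(1 for i in range(doublets) if seq[i:i + 2] == "CG") / doublets
    # Expected under independence of neighboring bases (an assumption).
    expected = (seq.count("C") / n) * (seq.count("G") / n)
    return observed, expected


obs, exp = cg_doublet_frequencies("CGCGAAUU")  # toy sequence
print(f"observed {obs:.3f}, expected {exp:.3f}")
```

An observed value well below the expected one, as reported here for yellow fever virus, indicates CG depletion relative to base composition.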

Comparative studies with other flavi- 
viruses should help to define areas of 
commonality of function in the nonstruc- 
tural proteins, to localize biologically 
important antigenic epitopes on the 
structural polypeptides (and NS1) and to 
ascertain whether certain features of the 
yellow fever sequence (like the putative 
secondary structure at the extreme 3' 
terminus and repeated nucleotide se- 
quences) are functionally significant 
landmarks conserved among flavivi- 
ruses. In addition, the construction of 
cDNA clones designed for expression of 
functional virus gene products or pro- 
duction of infectious virus should pro- 
vide useful new approaches for studying 
flavivirus molecular biology and patho- 
genesis as well as for development of 
flavivirus vaccines.

conditions described supra, for the expression of the fusion polypeptide, C100-3. The resulting polypeptides
are screened using the sera from individuals with NANBH, described supra, for the screening of im-
munogenic polypeptides encoded in HCV cDNAs expressed in E. coli.



Comparison of the Hydrophobic Profiles of HCV Polyproteins with West Nile Virus Polyprotein and with Dengue Virus NS1

The hydrophobicity profile of an HCV polyprotein segment was compared with that of a typical 
Flavivirus, West Nile virus. The polypeptide sequence of the West Nile virus polyprotein was deduced from 
the known polynucleotide sequences encoding the non-structural proteins of that virus. The HCV poly- 
protein sequence was deduced from the sequence of overlapping cDNA clones. The profiles were 
determined using an antigen program which uses a window of 7 amino acid width (the amino acid in
question, and 3 residues on each side) to report the average hydrophobicity about a given amino acid
residue. The parameters giving the reactive hydrophobicity for each amino acid residue are from Kyte and
Doolittle (1982). Fig. 19 shows the hydrophobic profiles of the two polyproteins; the areas corresponding to
the non-structural proteins of West Nile virus, ns1 through ns5, are indicated in the figure. As seen in the
figure, there is a general similarity in the profiles of the HCV polyprotein and the West Nile virus
polyprotein.
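
The windowed averaging described above can be sketched in Python as follows. This is a minimal illustration and not the program actually used: the function name and the toy sequence are assumptions, and only the window of 7 residues and the Kyte and Doolittle (1982) scale come from the description.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid code.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Average hydropathy over a centered window for each interior residue."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2  # 3 residues on each side for a window of 7
    values = [KD[aa] for aa in protein.upper()]
    # Residues closer than `half` to either end have no full window and are skipped.
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


profile = hydropathy_profile("IIIIIIIRRRRRRR")  # toy sequence, not HCV
print([round(v, 2) for v in profile])
```

On this scale positive averages indicate hydrophobic stretches and negative averages indicate hydrophilic ones.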

The sequence of the amino acids encoded in the 5'-region of HCV cDNA shown in Fig. 16 has been
compared with the corresponding region of one of the strains of Dengue virus, described supra, with
respect to the profile of regions of hydrophobicity and hydrophilicity (data not shown). This comparison
indicated that the polypeptides from HCV and Dengue encoded in this region, which corresponds to the
region encoding NS1 (or a portion thereof), have a similar hydrophobic/hydrophilic profile.

The similarity in hydrophobicity profiles, in combination with the previously identified homologies in the
amino acid sequences of HCV and Dengue Flavivirus in EP 0,318,216, suggests that HCV is related to these
members of the Flavivirus family.

Characterization of the Putative Polypeptides Encoded Within the HCV ORF

The sequence of the HCV cDNA sense strand, shown in Fig. 17, was deduced from the overlapping
HCV cDNAs in the various clones described in EPO Pub. No. 318,216 and those described supra. It may be
deduced from the sequence that the HCV genome contains primarily one long continuous ORF, which
encodes a polyprotein. In the sequence, nucleotide number 1 corresponds to the first nucleotide of the
initiator MET codon; minus numbers indicate that the nucleotides are that distance away in the 5'-direction
(upstream), while positive numbers indicate that the nucleotides are that distance away in the 3'-direction
(downstream). The composite sequence shows the "sense" strand of the HCV cDNA.

The amino acid sequence of the putative HCV polyprotein deduced from the HCV cDNA sense strand
sequence is also shown in Fig. 17, where position 1 begins with the putative initiator methionine.
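
The numbering convention just described can be illustrated with a short Python sketch. It assumes, consistent with "that distance away", that the nucleotide immediately upstream of the initiator codon is numbered -1 so that there is no position 0; the function name and the toy sequence are hypothetical.

```python
def cdna_position(index, atg_index):
    """Map a 0-based string index to the numbering relative to the initiator codon."""
    offset = index - atg_index
    # The first nucleotide of the initiator codon is +1; upstream counts from -1.
    return offset + 1 if offset >= 0 else offset


seq = "GGCATGAGC"            # toy sequence, not HCV
atg = seq.find("ATG")        # 0-based index of the initiator codon
print([cdna_position(i, atg) for i in range(len(seq))])
```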

Possible protein domains of the encoded HCV polyprotein, as well as the approximate boundaries are 
the following (the polypeptides identified within the parentheses are those which are encoded in the 
Flavivirus domain):

Putative Domain                                                       Approximate Boundary
                                                                      (amino acid nos.)

"C" (nucleocapsid protein)                                            1-120
"E" (virion envelope protein(s) and possibly matrix (M) proteins)     120-400
"NS1" (complement fixation antigen?)                                  400-660
"NS2" (unknown function)                                              660-1050
"NS3" (protease?)                                                     1050-1640
"NS4" (unknown function)                                              1640-2000
"NS5" (polymerase)                                                    2000-? end



It should be noted, however, that hydrophobicity profiles (described infra), indicate that HCV diverges
from the Flavivirus model, particularly with respect to the region upstream of NS2. Moreover, the
boundaries indicated are not intended to show firm demarcations between the putative polypeptides.
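
For readers who want to relate a residue number to the putative domains listed above, a minimal Python sketch is given below. The data structure and function name are assumptions for illustration. Because the boundaries are approximate and adjacent ranges share an endpoint, a boundary residue is reported as belonging to both neighboring domains rather than being forced into one.

```python
# Approximate boundaries copied from the list above; None marks the unknown end.
DOMAINS = [
    ("C", 1, 120),
    ("E", 120, 400),
    ("NS1", 400, 660),
    ("NS2", 660, 1050),
    ("NS3", 1050, 1640),
    ("NS4", 1640, 2000),
    ("NS5", 2000, None),
]


def putative_domains(residue):
    """Return every putative domain whose approximate range contains the residue."""
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    return [
        name
        for name, first, last in DOMAINS
        if residue >= first and (last is None or residue <= last)
    ]


print(putative_domains(120))   # boundary residue: both neighbors are listed
print(putative_domains(1100))
```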



The Hydrophilic and Antigenic Profile of the Polypeptide 



Profiles of the hydrophilicity/hydrophobicity and the antigenic index of the putative polyprotein encoded 
in the HCV cDNA sequence shown in Fig. 16 were determined by computer analysis. The program for 
hydrophilicity/hydrophobicity was as described supra. The antigenic index results from a computer program 
which relies on the following criteria: 1) surface probability; 2) prediction of alpha-helicity by two different
methods; 3) prediction of beta-sheet regions by two different methods; 4) prediction of U-turns by two
different methods; 5) hydrophilicity/hydrophobicity; and 6) flexibility. The traces of the profiles generated by
the computer analyses are shown in Fig. 20. In the hydrophilicity profile, deflection above the abscissa
indicates hydrophilicity, and below the abscissa indicates hydrophobicity. The probability that a polypeptide
region is antigenic is usually considered to increase when there is a deflection upward from the abscissa in
the hydrophilic and/or antigenic profile. It should be noted, however, that these profiles are not necessarily
indicators of the strength of the immunogenicity of a polypeptide.



Identification of Co-linear Peptides in HCV and Flaviviruses

The amino acid sequence of the putative polyprotein encoded in the HCV cDNA sense strand was
compared with the known amino acid sequences of several members of Flaviviruses. The comparison
shows that homology is slight, but due to the regions in which it is found, it is probably significant. The
conserved co-linear regions are shown in Fig. 21. The amino acid numbers listed below the sequences
represent the number in the putative HCV polyprotein. (See Fig. 17.)

The spacing of these conserved motifs is similar between the Flaviviruses and HCV, and implies that 
there is some similarity between HCV and these flaviviral agents.

The following listed materials are on deposit under the terms of the Budapest Treaty with the American
Type Culture Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned
the following Accession Numbers.

carried out utilizing the PCR amplification procedure, as
described in Section IV.C.3, except that the hybridization
probe was a kinased oligonucleotide derived from the clone
81 cDNA sequence. The results showed that the amplified
sequences hybridized with the clone 81 derived HCV cDNA
probe.

IV.H.3. Homology Between the Non-structural Protein of
Dengue Flavivirus (MNWVD1) and the HCV Polypeptides
Encoded by the Combined ORF of Clones 14i Through 39c

The combined HCV cDNAs of clones 14i through 39c
contain one continuous ORF, as shown in Fig. 26. The
polypeptide encoded therein was analyzed for sequence
homology with the region of the non-structural
polypeptide(s) in Dengue flavivirus (MNWVD1). The
analysis used the Dayhoff protein data base, and was
performed on a computer. The results are shown in Fig.
42, where the symbol (:) indicates an exact homology, and
the symbol (.) indicates a conservative replacement in the
sequence; the dashes indicate spaces inserted into the
sequence to achieve the greatest homologies. As seen from
the figure, there is significant homology between the
sequence encoded in the HCV cDNA, and the non-structural
polypeptide(s) of Dengue virus. In addition to the homol-
ogy shown in Fig. 42, analysis of the polypeptide segment
encoded in a region towards the 3'-end of the cDNA also
contained sequences which are homologous to sequences in
the Dengue polymerase. Of consequence is the finding that
the canonical Gly-Asp-Asp (GDD) sequence thought to be
essential for RNA-dependent RNA polymerases is contained
in the polypeptide encoded in HCV cDNA, in a location
which is consistent with that in Dengue 2 virus. (Data
not shown.)
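
The annotation symbols used in such an alignment can be reproduced with the following minimal Python sketch. It is not the program used for the analysis: the conservative-replacement groups are an illustrative assumption (they are not the Dayhoff scoring data), and the two short aligned strings are hypothetical.

```python
# Illustrative groups of similar residues (an assumption, not the Dayhoff matrix).
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def match_line(seq_a, seq_b, groups=CONSERVATIVE_GROUPS):
    """Build the symbol line for two aligned sequences: ':' exact, '.' conservative."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    symbols = []
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            symbols.append(" ")  # a dash is a space inserted to improve the alignment
        elif a == b:
            symbols.append(":")
        elif any(a in g and b in g for g in groups):
            symbols.append(".")
        else:
            symbols.append(" ")
    return "".join(symbols)


top, bottom = "GDDLV-K", "GDEIVAR"  # toy aligned fragments
print(top)
print(match_line(top, bottom))
print(bottom)
```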

Detection of Sequences in Captured Particles Which When Amplified by PCR Hybridize to HCV cDNA
Derived from Clone 81

The RNA in captured particles was obtained as described in Section IV.H.1. The analysis for sequences
which hybridize to the HCV cDNA derived from clone 81 was carried out utilizing the PCR amplification
procedure, as described in Section IV.C.3, except that the hybridization probe was a kinased oligonucleotide
derived from the clone 81 cDNA sequence. The results showed that the amplified sequences hybridized
with the clone 81 derived HCV cDNA probe.

IV.H.4. HCV DNA Is Not Detectable in NANBH Infected Tissue

[illegible] provide [illegible] HCV is not a DNA [illegible]


IV.H.4A Southern Blotting Procedure

FIG. 41-1. Homology between the polypeptide encoded by the combined ORF of clones
14i through 39c and the non-structural protein of the Dengue flavivirus (MNWVD1).
[Sequence alignment not legible in this copy.]




EP 0 318 216 A1 




[Figure: alignment of the HCV polypeptide with the Dengue flavivirus (MNWVD1) non-structural protein; sequence rows largely not legible in this copy.]

ASSIGNMENT



This Assignment is made by Tatsuo Miyamura, residing at 4-21- 
22-113, Hamadayama, Suginami-ku, Tokyo 168, Japan, and Izumu Saito, 
residing at 2-37-15-412, Yoyogi, Shibuya-ku, Tokyo 151, Japan, who 
are hereinafter collectively referred to as "Assignors." 

The Assignors have made certain new and useful inventions set 
forth in a United States patent application entitled "New HCV 
Isolate" filed in the name of the Assignors on 15 September 1989 
under attorney docket number 2300-0089, and given
Serial No. 408,045 by the United States Patent and
Trademark Office, hereinafter referred to as the "Inventions" or
the "Patent Application", respectively. 

The Director General of the National Institute of Health of 
Japan, on behalf of the National Institute of Health of Japan, 
having a place of business located at 2-10-35 Kamiosaki, Shinagawa- 
ku, Tokyo 141, Japan, and Chiron Corporation, a Delaware 
corporation having a place of business located at 4560 Horton 
Street, Emeryville, California 94608, United States of America, 
hereinafter collectively referred to as "Assignees", desire to 
collectively obtain all rights to the Inventions, the Patent 
Application, and any letters patent, United States or foreign, 
obtained therefor or thereon. 

In consideration of the sum of one dollar ($1.00) and other
good and valuable consideration actually received, the Assignors 
agree to assign, and hereby do assign, transfer and set over to the 
Assignees, in equal and undivided shares, the entire right, title 
and interest in the Inventions, the Patent Application, any and all 
letters patent in the United States of America and all foreign 
countries which may be granted therefor or thereon, any and all 
continuations, divisions and continuation-in-parts of the Patent
Application, any and all reissues and extensions of such letters 
patent, and all rights under the International Convention for the 
Protection of Industrial Property (also known as the "Paris 
Convention") arising from the Inventions and the Patent 
Application. 

For the same consideration recited above, the Assignors also: 

1) agree to execute all papers necessary in 
connection with the Patent Application and any 
continuing or divisional or reissue 
applications thereof and also to execute 
separate assignments in connection with such 
applications as the Assignees may deem 
necessary or expedient or essential to its 
full protection and title in and to the 
invention hereby transferred; 

2) agree to execute all papers necessary in 
connection with any interference which may be 
declared concerning the Patent Application or
continuation or division or re-issue thereof
and to cooperate with the Assignees in every 
way possible in obtaining evidence and going 
forward with such interference; 

3) agree to perform all affirmative acts which 
may be necessary to obtain a grant of a valid 
United States or foreign patent to the 
Assignees; 

4) agree to communicate to the Assignees or 
representatives thereof any facts known to 
them, testify in any legal proceedings 
regarding the Invention; 

5) authorize and request the Commissioner of 
Patents to issue any and all Letters Patents 
of the United States resulting from said 
application or any division or divisions or 
continuing applications thereof to the said 
Assignees, as the Assignees of the entire 
interest, and hereby covenants that they have 
full right to convey the entire interest 
herein assigned, and that they have not 
executed and will not execute, any agreement 
in conflict herewith; and 

6) grant the firm of Irell & Manella the power 
to insert on this assignment any further 
identification which may be necessary or 
desirable in order to comply with the rules of 
the United States Patent Office for 
recordation of this document. 

This Assignment shall be binding upon our heirs, executors, 
administrators, and/or assigns, and shall inure to the benefit of 
the heirs, executors, administrators, successors and/or assigns of 
the Assignees. 

In witness whereof, executed by the Assignors on the date(s) 
indicated below. 
Date: September 22, 1989 

Tatsuo Miyamura 



[signature] Date: [illegible] 



Izumu Saito 



The above signatures were made in my presence, or acknowledged 
to me, by Tatsuo Miyamura and Izumu Saito, who are both known to 
me: 

Witness: [signature] Date: September [illegible] 

Name: Hiroto Shimojo 

(Type or Print) 
sequence of the polypeptide encoded in the extended ORF in the derived sequence. 

Fig. 27 shows the sequence of the HCV cDNA in clone 12f, the segment which overlaps clone 14i, and 
the amino acids encoded therein. 

Fig. 28 shows the sequence of the HCV cDNA in clone 35f, the segment which overlaps clone 39c, and 
the amino acids encoded therein. 

Fig. 29 shows the sequence of the HCV cDNA in clone 19g, the segment which overlaps clone 35f, and 
the amino acids encoded therein. 

Fig. 30 shows the sequence of clone 26g, the segment which overlaps clone 19g, and the amino acids 
encoded therein. 

Fig. 31 shows the sequence of clone 15e, the segment which overlaps clone 26g, and the amino acids 
encoded therein. 

Fig. 32 shows the sequence in a composite cDNA, which was derived by aligning clones 12f through 
15e in the 5' to 3' direction; it also shows the amino acids encoded in the continuous ORF. 

Fig. 33 shows a photograph of Western blots of a fusion protein, SOD-NANB5-1-1, with chimpanzee 
serum from chimpanzees infected with BB-NANB, HAV, and HBV. 

Fig. 34 shows a photograph of Western blots of a fusion protein, SOD-NANB5-1-1, with serum from 
humans infected with NANBV, HAV, HBV, and from control humans. 

Fig. 35 is a map showing the significant features of the vector pAB24. 

Fig. 36 shows the putative amino acid sequence of the carboxy-terminus of the fusion polypeptide 
C100-3 and the nucleotide sequence encoding it. 

Fig. 37A is a photograph of a Coomassie blue stained polyacrylamide gel which identifies C100-3 
expressed in yeast. 

Fig. 37B shows a Western blot of C100-3 with serum from a NANBV infected human. 

Fig. 38 shows an autoradiograph of a Northern blot of RNA isolated from the liver of a BB-NANBV 
infected chimpanzee, probed with BB-NANBV cDNA of clone 81. 

Fig. 39 shows an autoradiograph of NANBV nucleic acid treated with RNase A or DNase I, and probed 
with BB-NANBV cDNA of clone 81. 

Fig. 40 shows an autoradiograph of nucleic acids extracted from NANBV particles captured from 
infected plasma with anti-NANB5-1-1, and probed with 32P-labeled NANBV cDNA from clone 81. 

Fig. 41a and b show autoradiographs of filters containing isolated NANBV nucleic acids, probed with 
32P-labeled plus and minus strand DNA probes derived from NANBV cDNA in clone 81. 

Fig. 42 shows the homologies between a polypeptide encoded in HCV cDNA and an NS protein from 
Dengue flavivirus. 

Fig. 43 shows a histogram of the distribution of HCV infection in random samples, as determined by an 
ELISA screening. 

Fig. 44 shows a histogram of the distribution of HCV infection in random samples using two 
configurations of immunoglobulin-enzyme conjugate in an ELISA assay. 

Fig. 45 shows the sequences in a primer mix, derived from a conserved sequence in NS1 of 
flaviviruses. 

Fig. 46 shows the HCV cDNA sequence in clone k9-1, the segment which overlaps the cDNA in Fig. 27, 
and the amino acids encoded therein. 

Fig. 47 shows the sequence in a composite cDNA which was derived by aligning clones k9-1 through 
15e in the 5' to 3' direction; it also shows the amino acids encoded in the continuous ORF. 
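
Figs. 32 and 47 describe composite cDNAs derived by aligning overlapping clones in the 5' to 3' direction. The following Python is a minimal sketch of that general idea only, and is not the procedure actually used to assemble the HCV composites. It assumes the fragments are already ordered 5' to 3' and that adjacent fragments overlap exactly; the function names and the toy sequences are hypothetical, and real clone overlaps would normally require some tolerance for mismatches.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two fragments where a suffix of `left` equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first, down to the minimum allowed.
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no exact overlap of the required length")


def build_composite(fragments: list[str], min_overlap: int = 4) -> str:
    """Merge fragments that are already ordered 5' to 3'."""
    composite = fragments[0]
    for fragment in fragments[1:]:
        composite = merge_overlapping(composite, fragment, min_overlap)
    return composite


# Toy fragments (hypothetical, not taken from any clone in the figures).
toy_clones = ["ATGGCGTACCTG", "ACCTGAAGGTCA", "GGTCATTTGA"]
composite = build_composite(toy_clones)
print(composite)
```

Trying the longest overlap first is a simple design choice for this sketch; with short or repetitive overlaps a different rule might be preferable.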

I. Definitions 

The term "hepatitis C virus" has been reserved by workers in the field for an heretofore unknown 
etiologic agent of NANBH. Accordingly, as used herein, "hepatitis C virus" (HCV) refers to an agent 
causative of NANBH, which agent is a virus characterised by: (i) a positive stranded RNA genome; (ii) said 
genome comprising an open reading frame (ORF) encoding a polyprotein; and (iii) the portion of said 
polyprotein corresponding to Figure 14 having at least 40% homology to the amino acid sequence in Figure 
14. This agent was formerly referred to as NANBV and/or BB-NANBV. The terms HCV, NANBV, and BB-NANBV 
are used interchangeably herein, but all refer to the virus as defined above. As an extension of this 
terminology, the disease caused by HCV, formerly called NANB hepatitis (NANBH), is called hepatitis C. 
The terms NANBH and hepatitis C may be used interchangeably herein. 
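
The definition above relies on a threshold of at least 40% homology to the amino acid sequence in Figure 14. This passage does not state how homology is to be computed, so the following Python is only an illustrative sketch under one assumption: that homology is measured as percent identity over the non-gap columns of two already aligned sequences. The function names, the gap character and the toy one-letter sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned, non-gap positions carrying the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if percent identity is at or above the threshold (assumed 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences (hypothetical, not from Figure 14).
reference = "MKV-LAGT"
candidate = "MRVQLSGT"
print(round(percent_identity(reference, candidate), 1))
print(meets_threshold(reference, candidate))
```

Other measures (for example, scoring conservative substitutions as matches) would give different percentages for the same pair.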

The term "HCV", as used herein, denotes a viral species which causes NANBH, and attenuated strains 
or defective interfering particles derived therefrom. As shown infra., the HCV genome is comprised of RNA. 
It is known that RNA containing viruses have relatively high rates of spontaneous mutation, i.e., reportedly 
FIG. 32-1 COMBINED ORF OF DNAs 12f through 15e 

IlePheLysIleArgMetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsn 




TrpThrArgGlyGluArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeu 

61 ACTG^CXX^GGGGCGAACGTTCCGATCTGGAA^ 

TCACCTG^CCCGCTKK^CGCTAGACCTTCTGTCCCTGTC 

LeuLeuThrThrThrGlnTrpGlnValLeuPr^ 

121 TACTGCTGACCACTACACAGTGGCAGGTCCTCCCGTGTTCCT^ 

ATGAC^ACTGGTGATCTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGATGGTCGGA 

SerThrGlyLeuIleHisLeuHisGlnAsnIleValAspValGlnTyrLeuTyrGlyVal 
181 TGTCCACCGGCCTCATCC^CCTCCACCAGAACATTGTGGACGTGCAGTACTTCT^ 

ACAGGTGGCCGG AG TAGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCC 

GlySerSerIleAlaSerf*pAlaIleLysTipGluTyrValValLeuI«uPh^u^u 

241 TGGGGTCAAGCATCGCGTCCTGGGCCATTAAGTGGGAGTACGTCGTT^ 

ACCCCAGTTCGTAGCKCAGGACCCGGTAATTCACCCTCATGCAGCAAGAGGAaU^GGAAG 

LeuA laAspAlaArgVa icy sSerCysLeuTrpMetMetLeuI^uIleSerG InAlaGlu 
301 TGCTTGCAGACGCGCX5CGTCTGCTCCTGCTTGTGGATGATGCTACTCATATC 

ACGAACGTCTGC<3CG0GCAGACGAGGACGAACACCTACTACGATGAGTATAGGGTTCGCC 

MaAlaLeuGluAsnLeuVallleLeuAsnAlaMaSerl^uMaGlyThrHi^lyLeu 
361 AGGCGGCTTTGGAGAACCTCGTAATACTTAATGCAG(^TCCCTGGCCGGGACGCACTOTC 
TCCGCCGAAACCTCTTGGAGCATTATGAATTACGTCGTAGGGAC0GGCCCTGOGTGCCAG 

ValSerPhel^uValPhePheCysPheMaTrpTyrLeuLysGlyLysTrpValProGly 
421 TTGTATCCTTCCTCGTGTTCTTCTGCTTTGCA3XMTATTTGAAGGGTAAGTGGGTGCCCG 
AACATAGGAAGGAGCACAAGAAGACGAMCGTACCATAAACTTCCCATTCACCCACGGGC 

AlaValTyrThrPheTyrGlyMetTrpProLeuIieuI«uI«uI*uI«uAlaLeuProGln 

481 GAGCGGTCTACACCTTCTACGGGATGTGGCCTCTCC^ 

CTCGCCAGATGTGGAAGATGCCCTACACXX^AGAGGAGGAayVGGACAACXXX^CGGGG 

ArqAlaTyrMaLeuAspThrGluValAlaMaSer<^sGlyGlyValValI^uyalGly 
541 AGO^CGTACGCGCTGGACACGGAGGTGGCCGCGTC^ 
TCGCCCGCATGC«X»CCTGTGCCTCCACCGGCGCA^ 

I^uMetAlaLeuThrl^uSerProTyrTyrLysArgTyrlleSerTrpC^s^TrpTrp 
601 GGTTGATGGCGCTGACTCTGTCACCATATTACAAGCGCTATATCAGCTGGTGC 

CCAACTACCGCGACTGAGACAGTGGTATAATGTTCGCGATATAGTCGACCAOGAACACCA 

LeuGlnTyrPheLeuThrArgValGluAlaGlnLeuHisValTrpIleProProlAuAsn 
661 GGCTTCAGTATTTTCTGACCAGAGTGGAAGCGCAACTGCACGTGTGGATTCCCC 

CCGAAGTCATAAAAGACTGGTCTCACCTTCGCGTTGACGTGCACACCTAAGGGGGGGAGT 

ValArgGlyGlyArgAspAlaVadIleLeuLeuMetCysAlaValHisPr^hjrt«uVal 

721 ACGTCCX^GGGGGGCGCG ACGCCGTCATCTTACTCATG TGTGCTGTACACCCGACTCTGG 
TGCAGGCTCCCCCCGCGCTGCX^CAGTAGAATGAGTACACACGACATCTGGGCTIGAGACC 

PheAspIleThrLysLeuLeuLeuAlaValPheGlyProI^uTrpIl^^lnAlaSer 
781 TATTTGACATCACCAAATTGCTGCTGGCCGTCTTCGGACCCCTTTGGATTCTTCAMCCA 
ATAAACTGTAGTGGTTTAACGACX5ACCX3GCAGAAGCCTGGGGAAACCTAAGAAGTTCGGT 

LeuLeuLysValProTyrPheValArgValGlnGlyl^uLeuArgPheCysAlal^uAla 
841 GTTTGCTTAAAGTACCCTACTTTGTGOGCGTCCAAGGCCTTCTCCGGTTCTGOGCGOTAG 
CAAACGAATTTCATGGGATGAAACACGCGCAGGTTCCGGAAGAGGCCAAGACGCGCAATC 

ArgLysMetlleGlyGlyHlsTyrValGlnMetValllelleLys^uGlyAl^L^Thr 

901 CGCGGAAGATGATCC^AGGCCATTACGTGCAAATGGTCATCATTAAGTTAGGGGCGOTA 
GCGCCTTCTACTAGCCTCCGGTAATGCACGTTTACCAGTAGTAATTCAATCCCCGCGAAT 
GlyThrTyrValTyrAsnHisLeuThrProLeuArgAspTrpAlaHisAsnGlyLeuArg 

961 CTCGCACCTATGTTTATAACCATCTCACTCCTCTTC 
GACCGTGGATAOUVATATTGGTAGAGTGAGGAGAA 

AspLeuAlaValAlaValGluProValValPheSerGlnMetGluThrLysIieuIleThr 

1021 GAGATCTGGCCGTGGCTGTAGAGCCAGTCGTC 

CTCTAGACCGGCACCGACATCTCGGTCAGCAGAAGAGGGTTT^ 

TrpGlyAlaAspThrAlaAlaCysGlyAspIlelleAsnGlyLeuProValSerAlaArg 
108 1 CGTGGGGGGCAGATACCGCCGCGTGCX^TGACATCATCAACXSGCTTGCCTGT^ 

GCACCCCCCGTCTATGGCGGCGCACGCCACTGTAGTAGTTGCCGAACGGACAAAGGCGGG 

ArgGlyArgGluIleLeuLeuGlyProAlaA^ 

114 1 GCAGGGGCCGGGAGATACTGCTCGGGCCAGCCGAT^AATGGTCTCCAAGGGGTGGAGGT 
CGTCCCCGGCCCTCTATGACGAGCXCGGTCGGCTACCTTACCAGAGGTTCCCCACCTCCA 

LeuAlaProIleTlurAlaTyrAlaGlnGlnThrArgGlyLeuLeuGlyCysIlelleThr 
1201 TGCTGGCGCCCATCACGGCGTACGCCCAGCAGACAAGGGGCCTCCTAGGGTGCATAATCA 
ACGACCGCGGGTAGTCCCGCATGCGGGTTC 

Serl^uThrGlyArgAspLysAsnGlnValGluGlyGluValGlnlleValSerThrAla 
1261 CCAGCCTAACTGGCCXX3GACAAAAACCAAGTGGAGGGTGAGGTCXIAGATTGTGTCA^ 
GGTCGGATTGACCGGCCCTGTTTTTGGTTCACCTCCCACTCCAGGTCTAA 

AlaGlnThrPheLeuAlaThrCysIleAsnGlyValCysTrpThrValTyrHisGlyAla 

1321 CTGCCCAAACCTTCCTGGCAACGTGCATCAATGGGGTC 

GACGGGTTTGGAAGGACCGTTGCACGTAGTTACCCCACACGACCTGACAGATGGTGCCCC 

GlyThrArgThrllcAlaSerProLysGlyProVallleGlnMetTyrThrAsnValAsp 
13 8 1 CCGGAACG AGGACCATCGCGTCACCCAAGGG TCCTGTCATCCAGATGTATACCAATGTAG 
GGCCTTGCTCCTGGTAGCGCAGTGGGTTCCCAGGACAGTAGGTCTACATATGG 

GlnAspLeuValGlyTrpProAlaProGlnGlySerArgSerl^uThrProCysThrCys 
1441 ACCAAGACCTTGTGGGCTGGCCCGCTCCGCAAGGTAGCOGCTCATTGAC^ 
TGGTTCTGGAACACCCGACCGGGCGAGGOTTTCCATCGGCGAGTAAC^ 

GlySerSerAspLeuTyrLeuValThrArgHisAlaAspVallleProValArgArgArg 
1501 GCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGATC 

CGCCX5AGGAGCCTGGAAATGGACCAGTGCTCCGTG0GGCTACAGTAAGGGCA(^CGGCCG 

GlyAspSerArgGlySerLeuLeuSerProArgProIleSerTyrLeuLysGlySerSer 

1561 GGGGTGATAGCAGGGGCAGCCTGCTGTCGCCCCGGCCCATTTCOT 

CCCCACTATCGTCCCCGTCGGACGACAGCGGGGCCGGGTAAAGGATGAACTTTCCGAGGA 

GlyGlyProLeuLeuCysPrciAlaGlyHlsAlaValGlyllePheArgAlaAlaValCys 
1621 CGGGGGGTCCGCTGTTGTGCCCCGCGGGGCACGCCGTGGGCATATTTAGGGCCGCGG 

GCCCCCCAGGCG ACAACACGGGGCGCCCCG TGCGGCACCCGTAT AAATCCCGGCGCCACA 

ThrArgGlyValAlaliysAlaValAspPHelleProValGluAsnLeuGluThrThrMet 

1681 gcaco:gtggagtggctaaggcggtggactttatc^ 

cgtgggcacctcacogattccgccacctgaaatagggacacc^ 

ArgSerProValPheThrAspAsnSerSerProProValValProGlnSerPheGlnVal 

174 1 TGAGGTCCCOGGTGTTCAOGGATAACTCCTCTCCACCAGTAGTC 

ACTCCAGGGGCCACAAGTGCCTATTGAGGAGAGGTGGTCATCACGGGGTCTCGAAGGTTC 

AlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysValProAlaAlaTyrAla 
1801 TGGCTCACCTCCATGCTCCXIACAGGCAGCGGCAAAAGCACCAAGGTCCCGGCTGCATATG 
ACCGAGTGGAGGTACGAGGGTGTCCXSTCGCCGTTTTTC 

AlaGlnGlyTyrLysValLeuValLeuAsnProSerValAlaAlaThrteuGlyPheGly 
1861 CAGCTCAGGGCTATAAGGTGCTAGTACTCAACCCCTCTGTTGCTGCAAC^ 
GTCGAGTCCCGATATTCCACGATCATGAGTTGGGGAGACAACGACGTTG 

AlaTyrMetSerLysAlaHisGlylleAspProAsnlleArgThrGlyValArgThrlle 

FIG. 32-2 
1921 GTGCTTACATGTCCAAGGCTCATGGGATCGATCCTAACATCAGGACCGGGGTGAGAACAA 
CACGAATGTACAGGTTCCGAGTACCCTAGCTAGGATTGT^ 

ThrThrGlySerProIleThrTyrSerThrTyrt 
1981 TTACCACTGGCAGCCCCATCACGTACTCCA(XTACGGCAAGTTCCTTC 

AATGGTGACCGTCGGGGTAGTGCATGAGGTGGATGCCGTTCAAGGAACGGCT 

SerGlyGlyAlaTyrAspIlellelleCysAspGluCysHisSerThrAspAlaThrSer 
204 1 GCTCGGGGGGCGCTTATGACATAATAATTTGTGACGAGTGCC 
CGAGCCCCCCGCGAATACTGTATTATTAAACACTGCTC^ 

IleteuGlylleGlyThi^alLeuAspGlnMaGlu^ 
2101 CCATCTTGGGCATCGGCACEGTCCTTGACCAAGCAGA^ 

GGTAGAACCCGTAGCCGTGACAGGAACTGG iTCSTCTCTGACGCCCCCGCTCTGACCAAC 

LeuAlaThrAlaThrProProGlySerValThrValProHisProAsnlleGluGluVal 
2161 TGCTCGCCACCGCCACCCCTCCGGGCTCCGTCACTGTC 

AOGAGCGGTGGOGGTGGGGAGGCCOGAGGCAGTGACACGGGGTAGGGTTC 

Alal^uSerThrThrGlyGluIleProPheTyrGlyLysAlalleProLeuGluVallle 
2221 TTGCTCTCTCCACCACCGGAGXGATCCCTTTTTAOGGCAA 

AACGAGACAGGTGGTGGCCTCTCTAGGGAAAAATGCCGTTCCGATAGGGGGAGCTTCATT 

LysGlyGlyArgHisLeullePheCysHisSerLysLysLysCysAspGluLeuAlaAla 

2281 TCAAGGGGGGGAGACATCTCATCTTCTGTCATTCAAAGA^ 

AGTTCCCCCCCTCTGTAGAGTAGAAGACAG TAAGTTTCTTCTTCACGCTGCTTG AGCGGC 

LysLeuValAlaLeuGlylleAsnAlaValAlaTyrTyrArgGlyLexiAspValSerVal 
2341 CAAAGCTGGTOGCATTGGGCATCAATGCCGTCGCCTACTACOGCGGTCTTG^ 
GTTTCGACCAGCGTAACGCGTAGTTAOGGCAGCGGATGAIGGG^ 

IleProThrSerGlyAspValValValValAlaThrAspMa^ 
2401 TCATCCCSACCAGCGGCGATGTTCTCGra 

AGTAGGGCTGGTOGCCGCTACAACAGCAGCACOGTTGGCT 

GlyAspPheAspSexVallleAspCysAsnThrCy^ 
2461 CCGGCGACTTCGACTCGGTGATAGACTGCAATACGTGTGTC 
GGOO^TGAAGCTGAGCCACTATCTGA^ 

LeuAspProThrPheThrlleGluThrlle^ 
2521 GOCTTGACCCTACCTTCACCATTGAGACAATCAOGCTCCCCCAGGATGCT 
CGGAACTGGGATGGAAGTGGTAACTCTGTTAGTGOGAGGGGGTCCT 

GlnArgArgGlyArgThrGlyArgGlyLysProGlylleT^^ 
2581 CTCAACGTCGGGGCAGGACTGGCAGGGGGAAGCCAGGCATCTA 
GAGTTGCAGCCCCGTCCTGACCGTCC^^ 

GluArgPrc^erGlyMetPheAspSerSerValLeuCysGluCysTyrAspAlaGlyCys 
2641 GGGAGCGCCCCTCCGGCATGTTOGACTOGTCOGTCCTCTGTGAGTGCT 
CCCTCG(X3GGGAGGCCGTACAAGCTGAGCAGGCAGGAGAC^ 

MaTrpTyrGluLeuThrProAlaGluThrtlurValArgLeiiArgM 
2701 GTGCTTGGTATGAGCTCACGTCC^ 

CACGAACCATACTCGAGTCOGGGOGGCTCTGATG^ 

ProGlyLeuProValCysGlnAspHisI^uGluPheTrpGluGlyValPheThr<31yLeu 
2761 CCCCGGGGCTTCCCGTCTG<XAGGA<XATC 

GGGGCCCCGAAGGGCACAOGGTCCTGGTAGAACTTAAAACCCTCCCGCAGAAATGTCOGG 

ThrHislleAspAlaHisPheLeuSerGlnThrL^^ 
2821 TCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGGGAG 
AGTGAG TATATCTACGGGTGAAAGATAGGGTCTGTTTCGTCTCACCCCTO 

LeuValMaTyrGlnAlaThrValCysAlaArgAlaGlnAlaPro 
2881 ACCTGGTAGCGTACCAAGCCACOGTGTGCGCTAGGGCTCAA 

TGGACCATCGCATGGTTCGGTGGCACACX^GATCCOGAGTTCGGGGAGG 

FIG. 32-3 
GlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeu 
294 1 ACCAGATGTGGAAGTGTTTGATTCGCCTCAAGCCCACCCTC 

TGGTCTACACCTTCACAAACTAAGOGGAGTTCGGGTGGGAGGTAC^ 

TyrArgLeuGlyAlaValGlnAsnGluIleThrLeuThrHisProValThrLysTyrlle 
3001 TATACAGACTGGGCGCTGTTCAGAATGAAATCACCCTGACGC^ 
ATATGTCTX3ACCCGaACAAGTCTTACTTTAGTGGGAC 

MetThrCysMetSerMaAspLeuGluValValThrSerThrTrpVall-euValGlyGly 

306 1 TCMGACATGCATGTCGGCCGAC^ 

AGTACTGTACGTACAGCCGGCTGGACCTCC^^ 

ValLeuAlaMal^uAlaMaTyrCysLeuSerThrGlyCysValVallleValGlyArg 

3121 GCGTCCTGGCTGCTTTGGCCGCGTATTGCCTGTCA 

CGC^GGACOTACGAAACCGGOGCATAACGGACAGTTGTCCGACGCACCAGTATCACCCGT 

ValValLeuSerGlyLysProAlallelleProAspArgGluValLeuTyrArgGluPhe 

3181 gggto;tcttgtcogggaagccggc^ 

cccagcagaacaggcccttcggcxx5ttagtatgga 

AspGluMetGluGluCysSerGlnHisI^uPro^ 
3241 TCGATGAGATGGAAGAGTGCTCTCAGCACTTACCX3TACATCGAGCAAGGGA1^ 

AGCTACTCTACCTTCTCACGAG AGTCGTG AATGGCATG TAGCTCGTTCCCTACTACGAGC 

GluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaSerArgGlnAlaGluVal 
3301 CCGAGCAGTTCAAGCAGAAGGCCCTOGGCCTCCTGCAGACCGCGTCCCGTCAGGCAGAGG 
GGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAGGACGTC 

IleAlaProAlaVaaGlnThrAsnTrpGlnLysLeuGluThrPheTrpAlaLysHisMet 
3361 TTATCGCCCCTGCTGTCCAGACCAACTGGCAAAAACTCGAGACCTTCTGGGCGAAGCATA 
AAT AGCGGGG ACG ACAGGTCTGGTTGACCG TTTTTG AGCTCTGG AAG ACCCGCTTCGTAT 

TrpAsnPhelleSerGlylleGlnTyrLeuAlaGlyl^uSerThrLeuProGlyAsnPro 

3421 TGTGGAACTTCATCAGTGGGATACAATACTTGGCGG 

ACACCTTGAAGTAGTCACCCTATGOTATGAACOGCCCGAACAGTTGCGAOGGACCATTTC 

AlalleAlaSerLeuMetAlaPheThrAlaAlaValThrSerPrc^uThrThrSeiGln 
3481 COGCCATTGCTTCATTGATGGCTTOTACAGCTGCTGTCACCAGCCCACT 

GGCGGT AACG AAGT AACTACCG AAAATGTCG ACG ACAGTGGTCGGG TG ATTGGTG ATCGG 

ThrLeuLeuPheAsnlleLeuGlyGlyTrpVal^ 
3541 AAACCCTCCTCTTCAACATATTGGGGGGGTGGGTGGCTGCCCAGCTCGCOGC^ 

TTTGGGAGGAGAAGTTGTATAACCCCCCC^CCCACCGACGGGTCX»AGOGGOGGGGGCCAC 

AlaThrAlaPheValGlyAlaGlyLeuAlaGlyAlaAlalleGlySerValGlyLeuGly 

3601 CCGCTACTGCCTTTGTGGGCGCTGGCTT^ 

GGCGATGACGGAAACACOCGOGACCGAATOSA^ 

LysValLeuIleAspIleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAla 
3661 GGAAGGTCCTCATAGACATCCTTGCAGGGTATGGCX3CXK3GOT 
CCTTCXaGGAGTATCTGTAGGA^ 

PheLysIleMetSej^lyGluValProSerThrGluAspLeuValAsnLeuIieuProAla 
3721 C^TTCAAGATCATGAGOGGTGAGGTOCCCTCCACGGAGGACCTGGTCAATCTACTC 
GTAAGTTCTAGTACTCGCCACTCCAGGGGATC 

IleLeuSerProGlyAlaLeuValValGlyValValCysAlaAlalleLeuArgArgHis 

3781 ccatcctctcgcccggagccctogtagtcggcgtggtctgtgcagcaatactgcgccggc 
ggtaggagagcgggcctcgggagcatcagccs^^ 

ValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArg 
3841 ACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGGATGAACCXjGCTGATAGC 

TGCAACC^GCCCGCTCCCCCGTCACGTCACCTACTTGGCCGACTATCGGAAGCGGAGGG 

GlyAsnHisValSerProThrHisTyrValProGluSerAspAlaAlaAlaArgValThr 

FIG. 32-4 
3901 GGGGGAACCATGTTTCCCCCACGCACTACGTGCCGGAGAGCGATGCAGCTGCCCGCGTCA 
CCCCCTTGGTACAAAGGGGGTGCGTGATG<^CGGC^ 

AlalleLeuSerSerLeuThrValThrGlnLeuLeuArgArgLeuHisGlnTrpIleSer 

3961 CTGCCATACTCAGCAGCCTCAC1OTAACCCAGCTCCTGAGGCGAC 
GACGGTATGAGTCGTCGGAGTGACATTGGGTCGA 

SerGluCysThrThrProCysSexGlySerTrpLeuArgAspIleTrpAspTrpIle^ 
4021 GCTC^AGTGTACCACTCCATGCTCCGGTTCCTGGCT 
CGAGCCTCACATGGTGAGGTACGAGGCCAAGGACC^ 

GluValLeuSerAspPheLysThrTrpLeuLyaAlaLysLeuMetProGlnl^uProGly 
4081 GCGAGGTGTTGAGCGACTTTAAGACCTGGCTAAAAGCTAAGC^ 
CGCTCCACAACTCGCTGAAATTCTGGACCGATTTTC^ 

IleProPheValSerCysGlnArgGlyTyrLysGlyValTrpArgValAspGlylleMet 

4141 GGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT^^ 
(X^AGGGGAAACACAGGACGGTCGCGCCCATATTCC^ 

HisThrArgC^sHisCysGlyMaGluIleThrGlyHisValLysAsnGlyThrffetA^ 
4201 TGCACACTCGCTGCCACTGTGGAGCTGAGATCACTGGAC^ 

ACGTGTGAGCGACGGTGACACCTCGACTCTAGTGACCTGTACAGTTTO 

IleValGlyProArgThi^sArgAsnMetTrpSexGlyThrPheProIleAsnAlaTyr 
4261 GGATCGTOGGTCCTAGGAC£TGCAGGAACATGTGGAGTGGGACC 

CCTAGCAGCCAGGATCC1X3GACGTCCTTG TACACCTCACCCTGGAAGGGG TAATTACGGA 

ThrThrGlyProCysThrProLeuProAlaProAsnTyrThrPheMateuTrpArgVal 
4321 ACACCACGGGCCCCTGTACCCCCCTTCCTC 

TGTGGTGCCCGGGGACATGGGGGGAAGGACG<X5GCTTCATC 

SerAlaGluGluTyrValGluIleArgGlnValGlyAspPheHisTyrValThrGlyMet 
4381 TGTCTGCAGAGGAATATCTGGAGATAAGGCAGGTGGGGGACTTCC^ 

ACAGACGTCTCCTTATACACCTCTATTCCGTCCACCCCCTGAAGGTGATGCACT 

ThrThrAspAsnLeuLysCysProCysGlnValProSerProGluPhePheThrGluLeu 
4441 TGACTACTGACAATCTCAAATGOCOGIX^CAGGTC<XATC 

ACTGATGACTGTTAGAGTTTACGGGCACGGTCCAGGGTAGCGGGCTTAAAAAGTG 

AspGlyValArgl^uHisArgPheAlaProProCysLysProLeuLeuArgGluGluVal 
4501 TGGACGGGGTGCGCCTACATAGGTTTGCGCCCCCXITGCAAGCCCTTGCTG 
ACCTGCCCCACGCGGATGTATCCAAACGCGGGGGGACGTTCGGGAATO 

SexPheArgValGlyLeuHisGluTyrProValGlySerGlnLeuProCysGluProGlu 
4561 TATCATTCAGAGTAGGACTCCACGAATACCCGGTAGGGTCG 
ATAGTAAGTCTCATCCTGAGGTGCTTATGGGCCATCCCAGC^ 

ProAspValAlaValLeuThrSexMetLeuThrAspProSerHisIleThrAlaGluAla 
4621 AACCGGACGTGGCCGTGTTGACGTCCATGCTCACTGATC 

TTGGCCTGCACCGGCACAACTGCAGGTACGAGTGACTAGGGAGOT 

MaGlyArgArgLeuMaArgGlySerProProSerValAlaSerSerSerAlaSerGln 
4681 CGGCCGGGO^AGGTTGGOGAGGGGATCACCCCCCTCTC 
GCCGGCCCGCTTCCAACCGCTCCCC^^ 

LeuSexAlaProSerLeuLysMaThrCysThrM 
4741 AGCTATCCGCTCCATCTCTCAAGGCAACTTGCACCGCTAA 
TCGATAGGCGAGGTAGAGAGTTCCGTTCAArc 

LeuIleGluMaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsnlleThrArgValGlu 
4801 AGCTCATAGAGGOCAACCTCCTATGGAGGCAGGAGATGGGCGGCAACATCACCAGGG 

TCGAGTATCTCCGG TTGGAGGATACCTCCG TCCTCTACCCGCCGTTGTAG TGG TCCCAAC 

SerGluAsnLysValVallleLeuAspSerPheAspProLeuValAlaGluGluAspGlu 

4861 AGTCAGAAAACAAAGTGGTGATTCTGGACTCCTTCGATCCGCTTGTGGCGGAGGAGGACG 
TCAGTCTTTTGTTTCACCACTAAGACCTCAGGAAGCTAGGCGAACACCGCCTC 
ArgGluIleSerValProAlaGluIleLeuArgLysSerArgArgPheAlaGlnAlaLeu 
4921 AGCGGGAGATCTCCGTACCCGCAGAAATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCC 
TCGCCCTCTAGAGGCATGGGCGTCTTTAGGACGCCTTCAGAGCCTCTAAGCGGGTCCGGG 

ProValTrpAlaArgProAspTyrAsnProProLeuValGluThrTrpLysLysProAsp 
4981 TGCCCGTTTGGGCGCGGCCGGACTATAACCCCCCGCTAGTGGAGACGTGGAAAAAGCCTO 
ACGGGCAAACCCGCGCCGGCCTGATATTGGGGG^ 

TyrGluProProValValHisGlyCysProLeuProProProLysSerProProValPro 
5041 ACTACGAACCACCTGTGGTCCATGGCTGTCCGCTTCCACCTCCAAAGTCCCCTCCTGTGC 
TCATGCTTGGTGGACACCAGGTACCGACAGGCGAAGGTGGAGGTTT^ 

ProPr °ArgLysLysArgThiValValI^ 
5101 CTCCGCCTCGGAAGAAGCGGACGGTGGTCCTCACTGAATCAACCCTATCTACTC 
GAGGCGGAGCCTTC1TCGCCTGCCACCAGGAGTGACTTAGTTGGGATAGATGACG 

GluLeuAlaThrArgSerPheGlySerSerSerThrSerGlylleThxGlyAspAsnThr 
516 1 CCG AGC TCGCCACCAG AAGCTTTGGCAGCTCCTCAACTTCCGGCATTA 

GGCTCGAGCGGTGGTCTTCGAAACCGTCGAGGAGTTGAAGGCCGTAATGCCCGCTG 

c ^ T ^^SerSerGluPraAlaProSerGlyCysProProAspSerAspAlaGluSerTyr 
5221 CGACAACATCCTCTGAGCCCGCCCCTTCTGGCTGCCCCCCCGACTCCGACGCTGAGTCCT 
GCTGTTGTAGGAGACTCGGGCGGGGAAGACCGACGGGGGGGCTCAGGCTGC^ 

SerSerMetProProLeuGluGlyGluPrc5GlyAspProAspLeuSerAspGlySerTrp 
5281 ATTCCTCCATGCCCCCCCTGGAGGGGGAGCCTGGGGATCCX3GATCTTAGCGACGGGTCAT 
TAAGGAGGTACGGGGGGGACCTCCCCCTCGGACCCCTAGGCCTAGAATCGCTGCCCAGTA 

c , , , „ SerThrValSerSerGluAlaAsnAlaGluAspValValCysCy sSerMetSerTyrSer 
5341 GGTCAACGGTCAGTAGTGAGGCCAACGCGGAGGATGTCGTGTGCTGCTCAATC 
C(^GTTGCCAGTCATCACTCCGGTTGCGCCTCCTACAGCA 

cxni ^^ET^ 1 y^ a ^ uVal ^ p roCysAlaAlaGluGluGlnLysI^uProIleAsnAl^ 
5401 CTTGGACAGGCGCACTCGTCACCCCGTGOTCCGCGGAAGAACAGAA^ 

GAACCTGTCOGCGTGAGCAGTGGGGCACGCGGCGCCTTCTTCTC 

^^^^® rAsn ^^^uArgHisHisAsnLeuValTyrSerThrThrSerArgSerAl 
5461 CACTAAGCAACTCG TTGCTACG TCACCACAATTTGGTGTATTCCACCACCTCACGCAGTG 
GTGATTCGTTGAGCAACGATGCAGTGGTGTTAAACCACATAAGGTC^ 

5521 

C(rrt , AspVall^uLysGluValLysAlaAlaAlaSerLysValLysAlaAsnLeuLeuSerVal 
5581 AGGACGTACTCAAGGAGGTTAAAGCAGCXX5(^TCAAMGTGAAGGCTAACTTGCTATC 
TCCTGCATGAGTTCCTCCAATTTCGTCGCCGC^ 

GluGluAla(^sSerI^uThrProProHisSerAlaLy.sSerLysPheGlyTyrGlyAla 
5641 TAGAGGAAGCTTGCAGCCTGACGCCCCCACACTCAGCCAAATCC^ 

ATCTCCTTOGAAOGTCGGACTGOGGGGGTGTCAGTOGGTTTAGGTTC^ 

e-,«, L y^ s P v alArgCysHisAlaArgLysAlaValThrHisIleAsnSerValTrpLysAsp 
5/01 CAAJ^AOGTCCGT^ 

GTTTTC TGCAGGCAACGGTACX^TCTTTCCGGCATTGG^ 

LeuLeuGluAspAsnValThrProIleAspThrThrlleMetAlaLysAsnGluValPhe 
5761 ACCTTCTGGAAGACAATGTAACACCAATAGACACTACCATCATGGCTAAGAACGAGGTTT 
TGGAAGACCTTCTGTTACATTGTGGTTATCTGTGATGGTAGTACCGATTCTTGC 

CysValGlnProGluLysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeu 
5821 TCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAGCCAQCTCGTC^ 

AGACGCAAGTCGGACTCTTCCCCCCAGCATTCGGTCGAGCAGAGTAGCACAAGGGGCTAG 
GlyValArgValCysGluLysMetAlaLeuTyrAspValValThrLysLeuProLeuAla 
5881 TGGGCGTGCGCGTGTGCGAAAAGATGGCTTTGTACGA^ 

ACCCGCACGOGCACACGCTTTTCTACCGAAACATCCTGCACCAATGTTTCG 

ValMetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuVal 
594 1 CCGTGATGGGAAGCTCCTACGGATTCCAATACTCACCAGGACAGCGGGTTC 
GGCACTACCCTTCGAGGATGCCTAAGGT^^ 

GlnAlaTrpLysSerLysLysThrProMetGlyPheSerT^ 
6001 TGCAAGCGTGGAAGTCCAAGAAAACCCCAATGGGGTTC 
ACGTTCGCACCTTCAGGTTCTTT^^ 

SerThrValThrGluSerAspIleArgT^ 
6061 ACTCCAC^GTCACTGAGAGCGACATCOGTACGGAGGAGGCAATCTACCAATGTTGTGA^ 

TGAGGTGTCAGTGACTCTCGCTGTAGGCATGCCTCCTCCGTO 

AspProGlnMaArgValAlalleLysSerLeuThrGluArgLeuTyrValGlyGlyPTO 
6121 TCGACCCCCAAGCCCGCGTGGCCATCAAGTCCCTCACCGAGAGGCTTTA1X3TTGGGGGCC 
AGCTGGGGGTTCGGGCGCACCGGTAGTTCAGGGAGTGGCT 

LeuThrAsnSerArgGlyGluAsnCysGljrTyrArgArgQrsArgAlaSerGlyValLeu 
6181 CTCTTACCAATTCAAGGGGGGAGAACTGCGGCTATCGCAGGTGCOT 
GAGAATGGTTAAGTTCCCCCCTCTTGACGCCGATAGCGTCCA 

ThrThrSer<^sGlyAsnThrLeuThrCysTyrIleLysAlaArgAlaAlaCysArgAla 

6241 TGACAACTAGCTGTGGTAACACCCTCACTTGCTACATCAAGGCCCGGGCAGCCTGTCGAG 
ACTGTTGATCX^CACCATTGTGGGAGTGAACGATGTAGTTCCGGGCCCGTC^ 

AlaGlyLeuGlnAspCysThrMetLeuValCysGlyAspAspLeuValVallleCysGlu 
6301 CCGCAGGGCTCCAGGACTGCACCATGCTOGTGTGTGGCGACGACOT 

GGCGTCCCG AGG TCCTGACGTGGTACGAGCACACACCGCTGCTGAATCAGC^^ 

SerAlaGlyValGlnGluAspAlaMaSerLeuArgAlaPheThrGluAlaMetThrArg 
6361 AAAGCGCGGGGGTCCAGGAGGACGCGGCGAGCCTGAGAGCCTTCACGGAGGCTATGACCA 
TTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCGGACTCTCGGAAGTC 

TyrSerAlaProProGlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSer 
6421 GGTACTCCGCCCCCCCTGGGGACCCCCCACAACCAGAATACGACTTGGAGCTCATAACAT 
CCATGAGGCGGGGGGGACCCCTGGGGGGTGTTGGTCTTATG 

CysSerSerAsnValSerValAlaHisAspGlyAlaGlyLysArgValTyrtTyrl^uTte 
6481 CATGCTCCTCCAACG1GTCAGTCGCCCACGACGGOGCTGGAAAGAGGGTCT 
GTAQ3AGGAGGTTGCACAGTCAG0GGGTGCTGCCGCGAC^ 

Ar^AspProThrThrProLeuAlaArgAlaMaTrpGluThrAlaArgHisThrProVal 

6541 CCCGTGACCCTACAArcCCCCTCGCG^ 

GGGCACTGGGATGTTGGGGGGAGOGCTCTCX^CGCACCCTC 

AsnSerTrpLeuGlyAsnllelleMetPheAlaProThrLeuTrpAlaArgMetlleLeu 

6601 TCAATTCCTGGCTAGGCAACATAATOVTGT 

AGTTAAGGACCGATCOGTTGTATTAGTACAAACGGGGGTGTGACACCCGCTCCTACTATG 

MetThrHisPhePheSerValLeuIleAlaArgAspGlnljeuGluGlnAlaLeuAspCys 
6661 TCATGACCCATTTCTTTAGCGTCCTTATAGCCAGGGACCAGCTTGAACAGGCCCTCGATT 
ACTACTGGGTAAAGAAATCGCAGGAATATCGGTCCCTGGTOGAAC^ 

GluIleTyrGlyAlaCysTyrSerlleGluPrdLeuAspLeuProProIlelleGlnArg 
6721 GCGAGATCTACGGGGCCTGCTACTCCATAGAACCACTTGATCTACCTCCAATCATTCAAA 
CGCTCTAGATGCCCCGGACGATGAGGTATCTTGGTGAACTAGATGGAGGTTAGTAAGTTT 

Leu 

6781 GACTC 
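
Each row of the sequence figures appears to follow a three-line layout: the encoded amino acids, the numbered coding strand, and the complementary strand written base for base beneath it. As a hedged sketch of how those three lines relate, the Python below takes the coding-strand row numbered 4921 in FIG. 32 and derives the other two lines from it. The codon table is deliberately limited to the codons in that row, the reading-frame offset of 2 is inferred from the amino acid line rather than stated in the figure, and the function names are hypothetical.

```python
# Watson-Crick pairing, written base for base (no reversal), as in the figure.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Only the codons needed for this one row, not the full genetic code.
CODON_TABLE = {
    "CGG": "Arg", "GAG": "Glu", "ATC": "Ile", "TCC": "Ser", "GTA": "Val",
    "CCC": "Pro", "GCA": "Ala", "GAA": "Glu", "CTG": "Leu", "AAG": "Lys",
    "TCT": "Ser", "AGA": "Arg", "TTC": "Phe", "GCC": "Ala", "CAG": "Gln",
}


def complement(strand: str) -> str:
    """Return the complementary strand, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)


def translate(strand: str, offset: int) -> str:
    """Translate complete codons starting at `offset`, three letters per residue."""
    codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Coding strand of the row numbered 4921 in FIG. 32.
row = "AGCGGGAGATCTCCGTACCCGCAGAAATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCC"

print(translate(row, 2))   # amino acid line (complete codons only)
print(row)                 # coding strand
print(complement(row))     # complementary strand
```

The first two bases and the last base of the row belong to codons that straddle the neighbouring rows, which is why this sketch yields 19 residues from a 60-base row.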
FIG. 47-1 COMBINED ORF OF DNAs k9-1 through 15e 

GlyCysProGluArgLeuAlaSerCysArgProLeuThrAspPheAspGlnGlyTrpGly 
1 CAGGCTGTCCTGAGAGGCTAGCCAGCTGCCGACCCCTTACCGATTTTGACCAGGGCTGGG 
GTCCGACAGGACTCTCCGATCGGTCGACGGCTGGGGAATGGCTAAAACTGGTCCCGACCC 

ProIleSerTyrAlaAsnGlySerGlyProAspGlnArgProTyrCysTrpHisTyrPro 
61 GCCCTATCAGTTATGCCAACGGAAGCGGCCCCGACCAGCGCCCCTACTGCTGGCACTACC 
CGGGATAGTCAATACGGTTGCCTTCGCCGGGGCTGGTCGCGGGGATGACGACCGTGATGG 

ProLysProCysGlylleValProAlaLysSerValCysGlyProValTyrCysPheThr 
121 CCCCAAAACCTTGCGGTATTGTGCCCGCGAAGAGTGTGTGTGGTCCGGTATATTGCTTCA 
GGGGTTTTGGAACGCCATAACACGGGCGCTTCTCACACACACCAGGCCATATAACGAAGT 

ProSerProValValValGlyThrThrAspArgSerGlyAlaProThrTyrSerTrpGly 
181 CTCCCAGCCCCGTGGTGGTGGGAACGACCGACAGGTCGGGCGCGCCCACCTACAGCTGGG 
GAGGGTCGGGGCACCACCACCCTTGCTGGCTGTCCAGCCCGCGCGGGTGGATGTCGACCC 

GluAsnAspThrAspValPheValLeuAsnAsnThrArgProProLeuGlyAsnTrpPhe 
241 GTGAAAATGATACGGACGTCTTCGTCCTTAACAATACCAGGCCACCGCTGGGCAATTGGT 
CACTTTTACTATGCCTGCAGAAGCAGGAATTGTTATGGTCCGGTGGCGACCCGTTAACCA 

GlyCysThrTrpMetAsnSerThrGlyPheThrLysValCysGlyAlaProProCysVal 
301 TCGGTTGTACCTGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCTCCTTGTG 
AGCCAACATGGACCTACTTGAGTTGACCTAAGTGGTTTCACACGCCTCGCGGAGGAACAC 

IleGlyGlyAlaGlyAsnAsnThrLeuHisCysProThrAspCysPheArgLysHisPro 
361 TCATCGGAGGGGCGGGCAACAACACCCTGCACTGCCCCACTGATTGCTTCCGCAAGCATC 
AGTAGCCTCCCCGCCCGTTGTTGTGGGACGTGACGGGGTGACTAACGAAGGCGTTCGTAG 

AspAlaThrTyrSerArgCysGlySerGlyProTrpIleThrProArgCysLeuValAsp 
421 CGGACGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATCACACCCAGGTGCCTGGTCG 
GCCTGCGGTGTATGAGAGCCACGCCGAGGCCAGGGACCTAGTGTGGGTCCACGGACCAGC 

TyrProTyrArgLeuTrpHisTyrProCysThrlleAsnTyrThrllePheLysIleArg 
481 ACTACCCGTATAGGCTTTGGCATTATCCTTGTACCATCAACTACACCATATTTAAAATCA 
TGATGGGCATATCCGAAACCGTAATAGGAACATGGTAGTTGATGTGGTATAAATTTTAGT 

MetTyrValGlyGlyValGluHisArgLeuGluAlaAlaCysAsnTrpThrArgGlyGlu 
541 GGATGTACGTGGGAGGGGTCGAACACAGGCTGGAAGCTGCCTGCAACTGGACGCGGGGCG 
CCTACATGCACCCTCCCCAGCTTGTGTCCGACCTTCGACGGACGTTGACCTGCGCCCCGC 

ArgCysAspLeuGluAspArgAspArgSerGluLeuSerProLeuLeuLeuThrThrThr 
601 AACGTTGCGATCTGGAAGACAGGGACAGGTCCGAGCTCAGCCCGTTACTGCTGACCACTA 
TTGCAACGCTAGACCTTCTGTCCCTGTCCAGGCTCGAGTCGGGCAATGACGACTGGTGAT 

GlnTrpGlnValLeuProCysSerPheThrThrLeuProAlaLeuSerThrGlyLeuIle 

661 CACAG1^CAGGTCCTCCCX5TGTTCCTTCACAACCCTACCAGCC^ 

GTGTCACCGTCCAGGAGGGCACAAGGAAGTGTTGGGATGGTCGGAACAGGTGG 

HisLeuHisGlnAsnlleValAspValGlnTyrLeuTyrGlyValGlySerSerlleAla 

721 TCCACCTCCACCAG AACATTGTGG ACGTGCAGTACTTG TACGGGGTGGGGTCAAGCATCG 
AGGTGGAGGTGGTCTTGTAACACCTGCACGTCATGAACATGCCCCACCCCAGTTCGTAG^ 


SerTrpAlaIleLysTrpGluTyrValValLeuLeuPheLeuLeuLeuAlaAspAlaArg 
781 CGTCCTGGGCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTTCTCCTTGCAGACGCGC 
GCAGGACCCG6TAATTCACCCTCATGCA^ 

ValCysSerCysLeuTrpMetMet^ 

841 gcgtctgctcctgcttgtggatcatcctXC 

: * CGCAGACGA&ACGA^ 

LeuVal I l^LeuAsnAlaAl aSerLeuAlaG iyThrHisGlyieuValSerPheLeuVal 
901 ACCTCGTAATACTTAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGTATCCTTCCTCG 
TGGAGCATTATGAATTACGTCGTAGGGACCGGCCCTGCGTGCCAGAACATAGGAAGGAGC 
A^TGTC^TC^CCGGCGCAGCACACCGCCACMCAAGAGCAGCCCAACTACCGCGACT 
AsoAlaValllel^uLeuMetCysAlaValHisProThrl^uValPheAspIleTteLys 

1261 ^SS^SSSSSSSSSSSSSSSSSSSSSS 

I*uI*uLeuAlaValPheGlyP«^uTrpIle^ 
1321 AATTCCTGCTGGCCGTCTTCGGACCCCTTTGGATTCTTCAAGCCAGTTTGCTTAAMTJ^ 

tcaacgmga^cm^ 

TvrPheValAraValGlnGlyLeuLeuArgPheCysAlaLeuAlaArgLysMetlleGly 
1 1 ft 1 rrTACTTTCTGCGCGTCCAAGGCCTTCTCCGGTTCTGCGCGTTAGCGCGGAAGATGATCG 

GlyHisTyrValGlnMetValllelleLysLeuGlyAlal/wTh^ 

1441 gaggcca^acgtgcaaatggtcatcattaagttaggggcgcttact^ 

^TCCGGTAATCCACGTOTACCAGTAGTAATTCAATCCCCGCGAATGACCGTGGATACAAA 
AsnHisI^uThrProI^uArgAspTrpAlaHisAsnGlyl^uArgAspI^uAlaValAla 

1501 SSSSSK^ 

ValGluProValValPheSerGlnMetGluThrLysI^uIleThrTrpGlyAlaAspThr 

1561 CTGTAGAGCCAGTCGTCTKTTCCC^TGGAGACCAAGC 

GACATCTCGGTCAGCAGAA6AGGGTTTACCTCTGGTTCGAGTAGTGCACCCCCCGTCTAT 

AlaAlaCvsGlvAspIlelleAsnGlyLeuProValSerAlaArgArgGlyArgGluIle 

1621 c ^^d^^s^^^^^ 
1681 TA ™« 

SgSS^cggct^ 

, AlaTyrAlaGlnGlnThrArgGlyLeuI^uGlyCysIlel^^ 

1741 cggwtacgcccagcagacaaggggcctcctagggtgca^ 

®SCMGCQGGTCOTCTpTK000B6^ 

AsDLvsAsnGlnValGluGlyGluValGlnlleValSerllttAlaAlaGlnTtePhg^ 

AlaSerProLysGlyProVallieGlnMetTyrThrAsnValAspGlnAspLeuValGly 
1921 TCGCGTCACCCAAGGGTCCTGTCATCCAGATGTATACCAATGTAGACCAAGACCTTGTGG 
AGCGGAGTGGGTTCCCAGGACAGTAGGTCTACATATGGTTACATCTGGTTCTGGAACACC 

TrpProAlaProGlnGlySerArgSerLeuThrProCysThrCysGlySerSerAspLeu 
1981 GCTGGCCCGCTCCGCAAGGTAGCCGCTCATTGACACCCTGCACTTGCGGCTCCTCGGACC 
CGACCGGGCGAGGCGTTCCATCGGCGAGTAACTGTGGGACGTGAACGCCGAGGAGCCTGG 

TyrLeuValThrArgHisAlaAspVallleProValArgArgArgGlyAspSerArgGly 
2041 TTTACCTGGTCACGAGGCACGCCGATGTCATTCCCGTGCGCCGGCGGGGTGATAGCAGGG 
AAATGGACCAGTGCTCCGTGCGGCTACAGTAAGGGCACGCGGCCGCCCCACTATCGTCCC 

Serl^uLeuSerProArgProIleSerTyrLeuLysGlySerSerGlyGlyProLeuLeu 
2101 GCAGCCTGCTGTCGCCCCGGCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCCGCTGT 
CGTCGGACGACAGCGGGGCCGGGTAAAGGATGAACTTTCCGAGGAGCCCCCCAGGCGACA 

CysProAlaGlyHisAlaValGlyllePheArgAlaMaValCysThrArgGlyValAl 
2161 TGTGCCCCGCGGGGCACGCCGTGGGCATATTTAGGGCCGCGGTGTGCACCCGTGGAGTGG 
ACACGGGGCGCCCCGTGCGGCACCCGTATAAATCCCGGCGCCACACGTGGGCACCTCACC 

LysAlaValAspPhelleProValGluAsnLeuGluThrThrMetArgSerProValPhe 
2221 CTAAGGCGGTGGACTTTATCCCTGTGGAGAACCTAGAGACAACCATGAGGTCCCCTC 

GATTCCGCCACCTGAAATAGGGACACCTCTTGGATCTCTGTTGGTACTCCAGGGGCCACA 

ThrAspAsnSerSerProProValValProGlnSerPheGlnValAlaHisLeuHisAla 
2281 TCACGGATAACTCCTCTCCACCAGTAGTGCCCCAGAGCTTCCAGGTGGCTCACCTCCATG 
AGTGCCTATTGAGGAGAGGTGGTCATCACGGGGTCTCGAAGGTCCACCGAGTGGAGGTAC 

ProThrGlySerGlyLysSerThrLysValProAlaAlaTyrAlaAlaGlnGlyTyrLys 
2341 CTCCCACAGGCAGCGGCAAAAGCACCAAGGTCCCGGCTGCATATGCAGCTCAGGGCTATA 
GAGGGTGTCCGTCGCCGTTTTCGTGGTTCCAGGGCCGACGTATACGTCGAGTCCCGATAT 

ValLeuValLeuAsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLys 
2401 AGGTGCTAGTACTCAACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTC^ 
TCCACGATCATGAGTTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGT 

AlaHisGlylleAspProAsnlleArgThrGlyValArgThrlleThrThrGlySerPro 
2461 AGGCTCATGGGATCGATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCC 
TCCGAGTACCCTAGCTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGG 

IleThrTyrSerThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyr 
2521 CCATCACGTACTCCACCTACX^CAAGTTCCTTGCCGACGGOK^TGCTCGGGGGGCGCTT 
GGTAGTGCATGAGGTGGATGCCGTTCAAGGAAO^rrc 

AspIlellelleCysAspGluCysHisSerThrAspAlaThrSerlleLeuGlylleGly 
2581 ATCACATAATAATTTGTGACGAGTGCCACTCCAOGGATGCCACATCCATCTTGGGCATCG 
TACTGTATTATTAAACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAGC 

ThrValI/*uAspGlnAlaGluThrAla^ 
2641 GCACTGTCCTTGACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCG 
CXSTGACAGGAACTGGTTCGTCTCTGACGCCCCCGCT^ 

ProProGlySerValThrValProHlsPrcAsnll^ 
2701 CCCCTCO^GCTCCGTCACTGTGCCCCATCCX^ 

GGGGAGGCCGGAGGCAGTGACACGGGGTAGGGTTGTAGCTC^ 

GlyGluIleProPheTyrGlyLysAlalleProLeuGluVallleLysGlyGlyArgHis 

2761 COGGAGAGATCCCTTTTTACGGCAAGGCTATCCCCCTCGAAGTAA 
GGCCTCTCTAGGGAAAAATGCCGTTCCGATAGGGGGAGCTTCATT^ 

Leul lePheCy sHisSerLy sLy sLy sCy sAspGluLeuAlaAlaLy sLeuValAlaLeu 
2821, ATG^fCTTCTGTCA^ 

r vG^ 

2881 TGOGCATCAATGCCGTGGCCTACTACCGCGGTCTTGACG 

ACCCGTAGTTACGGCACCGGATGATGGCGCCAGAACTGCACA(^ 
AspValValValValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSer 
2941 GCGATGTTGTCGTCGTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACT 
GGCTACAACAGCAGCACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGA 

ValIleAspCysAsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPhe
3001 CGGTGATAGACTGCAATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCT
GCCACTATCTGACGTTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGA 

ThrIleGluThrIleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArg
3061 TCACCATTGAGACAATCACGCTCCCCCAGGATGCTGTCTCCCGCACTCAACGTCGGGGCA
AGTGGTAACTCTGTTAGTGCGAGGGGGTCCTACGACAGAGGGCGTGAGTTGCAGCCCCGT

* *** 

ThrGlyArgGlyLysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGly
3121 GGACTGGCAGGGGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCG

CCTGACCGTCCCCCTTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGC 

MetPheAspSerSerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeu
3181 GCATGTTCGACTCGTCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGC
CGTACAAGCTGAGCAGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCG

ThrProAlaGluThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProVal
3241 TCACGCCCGCCGAGACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCG
AGTGCGGGCGGCTCTGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGC 

CysGlnAspHisLeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAla 
3301 TGTGCCAGGACCATCTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATG 
ACACGGTCCTGGTAGAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTAC 

HisPheLeuSerGlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGln 
3361 CCCACTTTCTATCCCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACC 
GGGTGAAAGATAGGGTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGG 

AlaThrValCysAlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCys 
3421 AAGCCACCGTGTGCGCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGT 
TTCGGTGGCACACGCGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCA

LeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAla
3481 GTTTGATTCGCCTCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCG
CAAACTAAGCGGAGTTCGGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGC

ValGlnAsnGluIleThrLeuThrHisProValThrLysTyrIleMetThrCysMetSer
3541 CTGTTCAGAATGAAATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGT 
GACAAGTCTTACTTTAGTGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACA 

AlaAspLeuGluValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeu
3601 CGGCCGACCTGGAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTT
GCCGGCTGGACCTCCAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAA

AlaAlaTyrCysLeuSerThrGlyCysValValIleValGlyArgValValLeuSerGly
3661 TGGCCGCGTATTGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCG

ACCGGCGCATAACGGACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGC 

LysProAlaIleIleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGlu
3721 GGAAGCCGGCAATCATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAG
CCTTCGGCCGTTAGTATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTC

CysSerGlnHisLeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGln
3781 AGTGCTCTCAGCACTTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGC
TCACGAGAGTCGTGAATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCG

• Lysid^uiGiyLe^ 
3 84 IV PAGAAGGCCCim^CTC^ 

GlnThrAsnTrpGlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSer
3901 TCCAGACCAACTGGCAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCA
AGGTCTGGTTGACCGTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGT

GlyIleGlnTyrLeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeu
3961 GTGGGATACAATACTTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCAT 
CACCCTATGTTATGAACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTA 

MetAlaPheThrAlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsn
4021 TGATGGCTTTTACAGCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCA
ACTACCGAAAATGTCGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGT

IleLeuGlyGlyTrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheVal
4081 ACATATTGGGGGGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTG 
TGTATAACCCCCCCACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAAC 

GlyAlaGlyLeuAlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAsp
4141 TGGGCGCTGGCTTAGCTGGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAG
ACCCGCGACCGAATCGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATC 

IleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSer
4201 ACATCCTTGCAGGGTATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGA 
TGTAGGAACGTCCCATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACT 

GlyGluValProSerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGly
4261 GCGGTGAGGTCCCCTCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCG 
CGCCACTCCAGGGGAGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGC 

AlaLeuValValGlyValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGlu
4321 GAGCCCTCGTAGTCGGCGTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCG
CTCGGGAGCATCAGCCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGC 

GlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSer 
4381 AGGGGGCAGTGCAGTGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTT
TCCCCCGTCACGTCACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAA 

ProThrHisTyrValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSer
4441 CCCCCACGCACTACGTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCA
GGGGGTGCGTGATGCACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGT

LeuThrValThrGlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThr
4501 GCCTCACTGTAACCCAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCA
CGGAGTGACATTGGGTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGT 

ProCysSerGlySerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAsp 
4561 CTCCATGCTCCGGTTCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCG 
GAGGTACGAGGCCAAGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGC 

PheLysThrTrpLeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSer
4621 ACTTTAAGACCTGGCTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGT
TGAAATTCTGGACCGATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACA

CysGlnArgGlyTyrLysGlyValTrpArgValAspGlyIleMetHisThrArgCysHis
4681 CCTGCCAGCGCGGGTATAAGGGGGTCTGGCGAGTGGACGGCATCATGCACACTCGCTGCC 
GGACGGTCGCGCCCATATTCCCCCAGACCGCTCACCTGCCGTAGTACGTGTGAGCGACGG 

CysGlyAlaGluIleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArg
4741 ACTGTGGAGCTGAGATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTA
TGACACCTCGACTCTAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGAT

ThrCysArgAsnMetTrpSerGlyThrPheP 

4801 GGACCTGCAGGAACATGTGGAGTGTCACCTTCCCC^TTA^
CCTGGACGTCCTTGTACACCT

ThrProLe^ 

4861 GTACCCCCCTTCCTGCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAAT
CATGGGGGGAAGGACGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTA

FIG. 47-5 






EP 0 318 216 B1 



ValGluIleArgGlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeu 
4921 ATGTGGAGATAAGGCAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATC
TACACCTCTATTCCGTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAG 

LysCysProCysGlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeu
4981 TCAAATGCCCGTGCCAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCC 
AGTTTACGGGCACGGTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGG 

HisArgPheAlaProProCysLysProLeuLeuArgGluGluValSerPheArgValGly 
5041 TACATAGGTTTGCGCCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAG 
ATGTATCCAAACGCGGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATC 

LeuHisGluTyrProValGlySerGlnLeuProCysGluProGluProAspValAlaVal
5101 GACTCCACGAATACCCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCG
CTGAGGTGCTTATGGGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGC

LeuThrSerMetLeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeu 
5161 TGTTGACGTCCATGCTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGT 
ACAACTGCAGGTACGAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCA 

AlaArgGlySerProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSer
5221 TGGCGAGGGGATCACCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCAT 
ACCGCTCCCCTAGTGGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTA 

LeuLysAlaThrCysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsn
5281 CTCTCAAGGCAACTTGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCA
GAGAGTTCCGTTGAACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGT

LeuLeuTrpArgGlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysVal
5341 ACCTCCTATGGAGGCAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAG 
TGGAGGATACCTCCGTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTC 

ValIleLeuAspSerPheAspProLeuValAlaGluGluAspGluArgGluIleSerVal
5401 TGGTGATTCTGGACTCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCG
ACCACTAAGACCTGAGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGC

ProAlaGluIleLeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArg 
5461 TACCCGCAGAAATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGC
ATGGGCGTCTTTAGGACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCG 

ProAspTyrAsnProProLeuValGluThrTrpLysLysProAspTyrGluProProVal 
5521 GGCCGGACTATAACCCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTG
CCGGCCTGATATTGGGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGAC

ValHisGlyCysProLeuProProProLysSerProProValProProProArgLysLys 
5581 TGGTCCATGGCTGTCCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGA

ACCAGGTACCGACAGGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCT 

ArgThrValValLeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArg
5641 AGCGGACGGTGGTCCTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCA
TCGCCTGCCACCAGGAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGT

SerPheGlySerSerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGlu
5701 GAAGCTTTGGGAGCTCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTG
CTTCGAAACCCTCGAGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGAC

ProAlaProSerGlyCysProProAspSerAspAlaGluSerTyrSerSerMetProPro 
5761 AGCCCGCCCCTTCTGGCTGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCC

TCGGGCGGGGAAGACCGACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGG

LeuGluGlj^luP^ 
5821 CCCTGGAGGGGGAGCCTGGGGATCGGGATCTTAGCGACGGGTCATGGTCAACGGTC 

GGGACCTCCCCCTCGGACC^ 

GluAlaAsnAlaGluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeu
5881 GTGAGGCCAACGCGGAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCAC
CACTCCGGTTGCGCCTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTG 

ValThrProCysAlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeu 
5941 TCGTCACCCCGTGCGCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGT 
AGCAGTGGGGCACGCGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCA 

LeuArgHisHisAsnLeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLys
6001 TGCTACGTCACCACAATTTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGA
ACGATGCAGTGGTGTTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCT

LysValThrPheAspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGlu
6061 AGAAAGTCACATTTGACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGG
TCTTTCAGTGTAAACTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCC

ValLysAlaAlaAlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSer
6121 AGGTTAAAGCAGCGGCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCA
TCCAATTTCGTCGCCGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGT

LeuThrProProHisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCys 
6181 GCCTGACGCCCCCACACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTT
CGGACTGCGGGGGTGTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAA

HisAlaArgLysAlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsn 

6241 GCCATGCCAGAAAGGCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACA
CGGTACGGTCTTTCCGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGT 

ValThrProIleAspThrThrIleMetAlaLysAsnGluValPheCysValGlnProGlu
6301 ATGTAACACCAATAGACACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTG 
TACATTGTGGTTATCTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGAC 

LysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeuGlyValArgValCys 
6361 AGAAGGGGGGTCGTAAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGT 
TCTTCCCCCCAGCATTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACA 

GluLysMetAlaLeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSer 
6421 GCGAAAAGATGGCTTTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCT 
CGCTTTTCTACCGAAACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGA

TyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSer
6481 CCTACGGATTCCAATACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGT
GGATGCCTAAGGTTATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCA 

LysLysThrProMetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGlu
6541 CCAAGAAAACCCCAATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTG 
GGTTCTTTTGGGGTTACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGAC 

SerAspIleArgThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArg
6601 AGAGCGACATCCGTACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCC
TCTCGCTGTAGGCATGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGG 

ValAlaIleLysSerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArg

6661 GCGTGGCCATCAAGTCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAA

CGCACCGGTAGTTCAGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTT 

GlyGluAsnCysGlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGly

6721 GGGGGGAGAACTGCGGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTG
CCCCCCTCTTGACGCCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACAC

AsnThrLeuThrCysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAsp

6781 GT&VCACC£TCACTTGCTA<^TCAAGGCCCGGG<^GCCTGTC 
C^^TGGGAGTGAACGATGTAGTOCC^ 

CysThrMetLeuValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGln
6841 ACTGCACCATGCTCGTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCC

TGACGTGGTACGAGCACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGG

GluAspAlaAlaSerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProPro

6901 AGGAGGACGCGGCGAGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCC
TCCTCCTGCGCCGCTCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGG

GlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnVal
6961 CTGGGGACCCCCCACAACCAGAATACGACTTGGAGCTCATAACATCATGCTC 
GACCCCTGGGGGGTGTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAG

SerValAlaHisAspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThr
7021 TGTCAGTCGCCCACGACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAA
ACAGTCAGCGGGTGCTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTT

ProLeuAlaArgAlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGly
7081 CCCCCCTCGCGAGAGCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAG
GGGGGGAGCGCTCTCGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATC

AsnIleIleMetPheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePhe
7141 GGAACATAATCATGTTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCT
CCTTGTATTAGTACAAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGA

SerValLeuIleAlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAla
7201 TTAGCGTCCTTATAGCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGG
AATCGCAGGAATATCGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCC 

CysTyrSerIleGluProLeuAspLeuProProIleIleGlnArgLeu
7261 CCTGCTACTCCATAGAACCACTTGATCTACCTCCAATCATTCAAAGACTC
GGACGATGAGGTATCTTGGTGAACTAGATGGAGGTTAGTAAGTTTCTGAG